Site-based dynamic pruning for query processing in search engines

2008-12-15
Altıngövde, İsmail Sengör
Can, Fazli
Ulusoy, Özgür
Web search engines typically index and retrieve at the page level. In this study, we investigate a dynamic pruning strategy that allows the query processor to first determine the most promising websites and then carry out the similarity computations only for the pages within those sites.
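
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_site_index`, `site_pruned_search`), the toy index, the `num_sites` cutoff, and the choice to score a site by summing the term weights of its pages are all assumptions made for illustration, since the abstract does not specify how sites are scored.

```python
from collections import defaultdict
import heapq


def build_site_index(page_index, site_of):
    """Aggregate a page-level index into a site-level one.

    Assumption: a site's weight for a term is the sum of its pages' weights.
    """
    site_index = defaultdict(lambda: defaultdict(float))
    for term, postings in page_index.items():
        for page, weight in postings.items():
            site_index[term][site_of[page]] += weight
    return site_index


def site_pruned_search(query_terms, site_index, page_index, site_of,
                       num_sites=2, top_k=3):
    """Hypothetical two-stage query processor: pick sites, then score pages."""
    # Stage 1: score whole sites and keep only the most promising ones.
    site_scores = defaultdict(float)
    for term in query_terms:
        for site, weight in site_index.get(term, {}).items():
            site_scores[site] += weight
    promising = set(heapq.nlargest(num_sites, site_scores, key=site_scores.get))

    # Stage 2: compute page scores only for pages inside the selected sites.
    page_scores = defaultdict(float)
    for term in query_terms:
        for page, weight in page_index.get(term, {}).items():
            if site_of[page] in promising:
                page_scores[page] += weight

    # Highest score first; ties broken by page name for determinism.
    return sorted(page_scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy page-level index: term -> {page: weight}. All values are made up.
page_index = {
    "pruning": {"a.org/1": 0.9, "a.org/2": 0.4, "b.org/1": 0.3, "c.org/1": 0.7},
    "index": {"a.org/1": 0.2, "b.org/1": 0.5, "b.org/2": 0.1},
}
site_of = {page: page.split("/")[0]
           for postings in page_index.values() for page in postings}
site_index = build_site_index(page_index, site_of)

results = site_pruned_search(["pruning", "index"], site_index, page_index, site_of)
ranked_pages = [page for page, _ in results]
```

In this toy run, the site `c.org` falls outside the two selected sites, so its page is never scored in the second stage. That skipped work is the intended saving, and it is also why such pruning can change the final ranking compared with scoring every page.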

Suggestions

Static index pruning in web search engines
Altıngövde, İsmail Sengör; Ulusoy, Özgür (Association for Computing Machinery (ACM), 2012-02-01)
Static index pruning techniques permanently remove a presumably redundant part of an inverted file, to reduce the file size and query processing time. These techniques differ in deciding which parts of an index can be removed safely; that is, without changing the top-ranked query results. As defined in the literature, the query view of a document is the set of query terms that access this particular document, that is, retrieve this document among their top results. In this paper, we first propose using qu...
A Cost-Aware Strategy for Query Result Caching in Web Search Engines
Altıngövde, İsmail Sengör; Ulusoy, Özgür (2009-01-01)
Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. In this study, we propose to explicitly incorporate the query costs in the static caching policy. To this end, a query’s cost is represented by its execution time, which involves CPU time to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using large Web crawl data and a real query log reveal that the proposed strategy impr...
Utilizing query performance predictors for early termination in meta-search
Şener, Emre; Altıngövde, İsmail Sengör; Department of Computer Engineering (2016)
In the context of the web, a meta-search engine is a system that forwards an incoming user query to all the component search engines (aka resources) and then merges the retrieved results. Given that hundreds of such resources may exist, it is mandatory for a meta-search engine to avoid forwarding a query to all available resources, but rather focus on a subset of them. In this thesis, we first introduce a novel incremental query forwarding strategy for meta-search. More specifically, given a ranked list of N ...
Impact of Regionalization on Performance of Web Search Engine Result Caches
Cambazoglu, B. Barla; Altıngövde, İsmail Sengör (2012-01-01)
Large-scale web search engines are known to maintain caches that store the results of previously issued queries. They are also known to customize their search results in different forms to improve the relevance of their results to a particular group of users. In this paper, we show that the regionalization of search results decreases the hit rates attained by a result cache. As a remedy, we investigate result prefetching strategies that aim to recover the hit rate sacrificed to search result regionalization...
Cost-aware result caching strategies for meta-search engines
Bakkal, Emre; Altıngövde, İsmail Sengör; Department of Computer Engineering (2015)
Meta-search engines are tools that generate top-k search results of a query by combining local top-k search results retrieved from various data sources in parallel. A result cache that stores the results of the previously seen queries is a crucial component in a meta-search engine to improve the efficiency, scalability and availability of the system. Our goal in this thesis is to design and analyze different cost-aware and dynamic result caching strategies to be used in meta-search engines. To this end, as ...
Citation Formats
İ. S. Altıngövde, F. Can, and Ö. Ulusoy, “Site-based dynamic pruning for query processing in search engines,” 2008, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/34504.