A Cost-Aware Strategy for Query Result Caching in Web Search Engines

2009-01-01
Search engines and large-scale IR systems need to cache query results for efficiency and scalability. In this study, we propose to explicitly incorporate query costs into the static caching policy. To this end, a query’s cost is represented by its execution time, which comprises the CPU time needed to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using a large Web crawl dataset and a real query log reveal that the proposed strategy improves overall system performance in terms of total query execution time.
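The idea of a cost-aware static cache can be illustrated with a minimal Python sketch. The abstract does not state the scoring formula, so the ranking used here (training-log frequency multiplied by execution cost) is an assumption chosen for illustration, as are the function names, the toy costs, and the toy query log. Cache hits are assumed to cost nothing.

```python
from collections import Counter


def select_static_cache(train_queries, cost, capacity, cost_aware=True):
    """Choose up to `capacity` distinct queries to place in a static cache.

    Assumed scoring (not taken from the paper): frequency * cost when
    cost_aware is True, plain frequency otherwise.
    """
    freq = Counter(train_queries)

    def score(query):
        return freq[query] * cost[query] if cost_aware else freq[query]

    # Highest score first; ties are broken by query string for determinism.
    ranked = sorted(freq, key=lambda q: (-score(q), q))
    return set(ranked[:capacity])


def total_execution_time(test_queries, cost, cached):
    """Sum the execution cost of every query that misses the cache."""
    return sum(cost[q] for q in test_queries if q not in cached)


# Hypothetical per-query execution times and a hypothetical query log.
cost = {"a": 1.0, "b": 10.0, "c": 2.0}
train = ["a"] * 5 + ["b"] * 2 + ["c"] * 3
test = list(train)

freq_only = select_static_cache(train, cost, capacity=1, cost_aware=False)
cost_based = select_static_cache(train, cost, capacity=1, cost_aware=True)

print(freq_only, total_execution_time(test, cost, freq_only))
print(cost_based, total_execution_time(test, cost, cost_based))
```

On this toy log with room for a single entry, the frequency-only policy caches the cheap but popular query "a" and leaves a total execution time of 26.0, while the assumed cost-aware score caches the expensive query "b" and leaves 11.0. The numbers only illustrate why cost can matter; they say nothing about the gains reported in the paper.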

Suggestions

A five-level static cache architecture for web search engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Barla Cambazoglu, B.; Junqueira, Flavio P.; Ulusoy, Ozgur (2012-09-01)
Caching is a crucial performance component of large-scale web search engines, as it greatly helps reduce average query response times and query processing workloads on backend search clusters. In this paper, we describe a multi-level static cache architecture that stores five different item types: query results, precomputed scores, posting lists, precomputed intersections of posting lists, and documents. Moreover, we propose a greedy heuristic to prioritize items for caching, based on gains computed by us...
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on large scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
Cost-Aware Strategies for Query Result Caching in Web Search Engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Ulusoy, Ozgur (Association for Computing Machinery (ACM), 2011-05-01)
Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. Static and dynamic caching techniques (as well as their combinations) are employed to effectively cache query results. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may significantly vary among different queries, and (ii) the processing cost of a query is not proportional to its po...
A transcoding robust data hiding method for image communication applications
Candan, Çağatay (2005-09-14)
We present a data embedding method for image communication applications. Our goal is to implement novel multimedia applications such as multi-language captions, interactive programming and title specific features over the existing image communication channel. To this aim, we present a data embedding method for JPEG images which has the desired degree of robustness to transcoding or bitrate adjustments that may take place in the communication channel. The described system is designed for JPEG images but can ...
Utilization of navigational queries for result presentation and caching in search engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Ulusoy, Özgür (2008-12-01)
We propose result page models with varying granularities for navigational queries and show that this approach provides a better utilization of cache space and reduces bandwidth requirements.
Citation Formats
İ. S. Altıngövde and O. Ulusoy, “A Cost-Aware Strategy for Query Result Caching in Web Search Engines,” 2009, vol. 5478, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/33161.