Search result diversification for selective search

2019
Küçükoglu, Emre Can
Our work explores the performance of result diversification methods in the selective search scenario, where the underlying document collection is topically partitioned across several nodes and the search is conducted only at a subset of these nodes. In particular, we investigate whether diversification at each node is superior to previous approaches in the literature, i.e., diversification at the broker node applied before the resource selection or after the result merging stages. We also compare the performance of different centralized sample indexes to show their effect on diversification. Finally, we explore the impact of recently introduced query expansion techniques using word embeddings to improve the effectiveness of diversification applied at the broker node, and subsequently, overall diversification. Our experiments reveal that for implicit diversification methods, expanding queries with diversified terms and applying diversification during the resource selection stage yield the best performance. In contrast, for explicit diversification methods, diversifying merged results at the broker is the best solution.
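
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the method evaluated in the thesis: the abstract does not name specific algorithms, so the sketch assumes a simple vote-counting resource selection over a centralized sample index (CSI) ranking and uses Maximal Marginal Relevance (MMR) with Jaccard similarity as a stand-in for an implicit diversification method. The function names `select_shards` and `mmr`, the parameter `lam`, and the data layout are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two term collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_shards(csi_ranking, doc_to_shard, k):
    """Assumed resource selection: each sampled document retrieved from the
    CSI casts one vote for the shard (node) it belongs to; keep the k shards
    with the most votes."""
    votes = Counter(doc_to_shard[d] for d in csi_ranking)
    return [shard for shard, _ in votes.most_common(k)]


def mmr(candidates, doc_terms, top_n, lam=0.5):
    """Greedy MMR re-ranking (an implicit diversification stand-in).

    candidates: list of (doc_id, relevance_score) pairs, e.g. merged results.
    doc_terms:  mapping doc_id -> iterable of terms, used for similarity.
    lam:        trade-off between relevance (1.0) and novelty (0.0).
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < top_n:
        def score(d):
            # Penalize similarity to the closest already-selected document.
            max_sim = max(
                (jaccard(doc_terms[d], doc_terms[s]) for s in selected),
                default=0.0,
            )
            return lam * remaining[d] - (1 - lam) * max_sim

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

In this sketch, applying `mmr` to the CSI ranking before calling `select_shards` would correspond to diversifying before resource selection, while applying it to the merged per-shard results would correspond to diversifying at the broker after merging.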

Suggestions

Efficient processing of category-restricted queries for Web directories
Altıngövde, İsmail Sengör; Ulusoy, Oezguer (2008-01-01)
We show that a cluster-skipping inverted index (CS-IIS) is a practical and efficient file structure to support category-restricted queries for searching Web directories. The query processing strategy with CS-IIS improves CPU time efficiency without imposing any limitations on the directory size.
Cost-aware result caching strategies for meta-search engines
Bakkal, Emre; Altıngövde, İsmail Sengör; Department of Computer Engineering (2015)
Meta-search engines are tools that generate top-k search results of a query by combining local top-k search results retrieved from various data sources in parallel. A result cache that stores the results of the previously seen queries is a crucial component in a meta-search engine to improve the efficiency, scalability and availability of the system. Our goal in this thesis is to design and analyze different cost-aware and dynamic result caching strategies to be used in meta-search engines. To this end, as ...
Using object-oriented materialized views to answer selection-based complex queries
Alhajj, R; Polat, Faruk (1999-09-01)
Presented in this paper is a model that utilizes existing materialized views to handle a wide range of complex selection-based queries, including linear recursive queries. Such queries are complex because it is almost impossible for naive users to predict the formulation of their predicate expressions. Object variables bound to objects in the result of a query are allowed to appear in the predicate of that query. Also, the predicate definition is extended to make it possible to have in the output only a sub...
Real-Time Moving Target Search
Undeger, Cagatay; Polat, Faruk (2007-11-23)
In this paper, we propose a real-time moving target search algorithm for dynamic and partially observable environments, modeled as a grid world. The proposed algorithm, Real-time Moving Target Evaluation Search (MTES), is able to detect the closed directions around the agent, and determine the best direction that avoids the nearby obstacles, leading to a moving target which is assumed to be escaping almost optimally. We compared our proposal with Moving Target Search (MTS) and observed a significant improvem...
A five-level static cache architecture for web search engines
Ozcan, Rifat; Altıngövde, İsmail Sengör; Barla Cambazoglu, B.; Junqueira, Flavio P.; Ulusoy, Ozgur (2012-09-01)
Caching is a crucial performance component of large-scale web search engines, as it greatly helps reduce average query response times and query processing workloads on backend search clusters. In this paper, we describe a multi-level static cache architecture that stores five different item types: query results, precomputed scores, posting lists, precomputed intersections of posting lists, and documents. Moreover, we propose a greedy heuristic to prioritize items for caching, based on gains computed by us...
Citation Formats
E. C. Küçükoglu, “Search result diversification for selective search,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Computer Engineering., Middle East Technical University, 2019.