Effective & efficient methods for web search result diversification

Özdemiray, Ahmet Murat
Search result diversification is one of the key techniques to cope with the ambiguous and/or underspecified information needs of the web users. In this study we first extensively evaluate the performance of a state-of-the-art explicit diversification strategy and pin-point its weaknesses. We propose basic yet novel optimizations to remedy these weaknesses and boost the performance of this algorithm. Secondly, we cast the diversification problem to the problem of ranking aggregation and propose to materialize the re-rankings of the candidate documents for each query aspect and then merge these rankings by adapting the score(-based) and rank(-based) aggregation methods. As a third contribution, for the first time in the literature, we propose using post-retrieval query performance predictors (QPPs) to estimate, for each aspect, the retrieval effectiveness on the candidate document set, and leverage these estimations to set the aspect weights. In addition to utilizing well-known QPPs from the literature, we also introduce three new QPPs that are based on score distributions and hence, can be employed for online query processing in real-life search engines. For the last contribution, we use retrieval performance predictions of query aspects to selectively expand those aspects that perform below some threshold, using the top retrieved documents of the aspect's own results. Our extensive experimental evaluations show that, despite having lower computational complexity than the state-of-the-art diversification strategies, certain ranking aggregation methods are superior to the existing explicit diversification strategies in terms of the diversification effectiveness. Furthermore, using QPPs for aspect weighting improves almost all state-of-the-art diversification algorithms in comparison to using a uniform weight estimator and also the proposed QPPs are comparable or superior to the existing predictors in the context of aspect weighting. Lastly, using QPP methods to selectively expand the query aspects provide better diversification performance compared to unexpanded or fully expanded aspects, for most of the diversification strategies.


Explicit Search Result Diversification Using Score and Rank Aggregation Methods
Ozdemiray, Ahmet Murat; Altıngövde, İsmail Sengör (2015-06-01)
Search result diversification is one of the key techniques to cope with the ambiguous and underspecified information needs of web users. In the last few years, strategies that are based on the explicit knowledge of query aspects emerged as highly effective ways of diversifying search results. Our contributions in this article are two-fold. First, we extensively evaluate the performance of a state-of-the-art explicit diversification strategy and pin-point its potential weaknesses. We propose basic yet novel ...
Utilizing query performance predictors for early termination in meta-search
Şener, Emre; Altıngövde, İsmail Sengör; Department of Computer Engineering (2016)
In the context of web, a meta-search engine is a system that forwards an incoming user query to all the component search engines (aka, resources); and then merges the retrieved results. Given that hundreds of such resources may exist, it is mandatory for a meta-search engine to avoid forwarding a query to all available resources, but rather focus on a subset of them. In this thesis, we first introduce a novel incremental query forwarding strategy for meta-search. More specifically, given a ranked list of N ...
Advanced methods for diversification of results in general-purpose and specialized search engines
Yiğit Sert, Sevgi; Altıngövde, İsmail Sengör; Ulusoy, Özgür; Department of Computer Engineering (2020-12-28)
Diversifying search results is a common mechanism in information retrieval to satisfy more users by surfacing documents that address different possible intentions of users. It aims to generate a result list that is both relevant and diverse when ambiguous and/or broad queries appear. Such queries have different underlying subtopics (a.k.a., aspects or interpretations) that search result diversification algorithms should consider. In this thesis, we first address search result diversification as a useful met...
Advanced methods for result and score caching in web search engines
Yafay, Erman.; Altıngövde, İsmail Sengör; Department of Computer Engineering (2019)
Search engines employ caching techniques in main memory to improve system efficiency and scalability. In this thesis, we focus on improving the cache performance for web search engines where our contributions can be separated into two main parts. Firstly, we investigate the impact of the sample size for frequency statistics for most popular cache eviction strategies in the literature, and show that cache performance improves with larger samples, i.e., by storing the frequencies of all (or, most of) the quer...
Limitations and improvement opportunities for implicit result diversification in search engines
Ulu, Yaşar Barış; Altıngövde, İsmail Sengör; Department of Computer Engineering (2019)
Search engine users essentially expect to find the relevant results for their query. Additionally, the results of the query should contain different possible query intents, which leads to the well-known problem of search result diversification. Our work first investigates the limitations of implicit search result diversification, and in particular, reveals that typical optimization tricks (such as clustering) may not necessarily improve the diversification effectiveness. Then, as our second contribution, we...
Citation Formats
A. M. Özdemiray, “Effective & efficient methods for web search result diversification,” Ph.D. - Doctoral Program, Middle East Technical University, 2015.