Supervised approaches for explicit search result diversification

2020-11-01
Yigit-Sert, Sevgi
Altıngövde, İsmail Sengör
Macdonald, Craig
Ounis, Iadh
Ulusoy, Özgür
Diversification of web search results aims to promote documents with diverse content (i.e., covering different aspects of a query) to the top-ranked positions, to satisfy more users, enhance fairness and reduce bias. In this work, we focus on explicit diversification methods, which assume that the query aspects are known at diversification time, and leverage supervised learning methods to improve their performance in three different frameworks with different features and goals. First, in the LTRDiv framework, we focus on applying typical learning-to-rank (LTR) algorithms to obtain a ranking where each top-ranked document covers as many aspects as possible. We argue that such rankings optimize various diversification metrics (under certain assumptions), and hence, are likely to achieve diversity in practice. Second, in the AspectRanker framework, we apply LTR for ranking the aspects of a query with the goal of more accurately setting the aspect importance values for diversification. As features, we exploit several pre- and post-retrieval query performance predictors (QPPs) to estimate how well a given aspect is covered among the candidate documents. Finally, in the LmDiv framework, we cast the diversification problem as an alternative fusion task, namely, the supervised merging of rankings per query aspect. We again use QPPs computed over the candidate set for each aspect, and optimize an objective function that is tailored for the diversification goal. We conduct thorough comparative experiments using both the basic systems (based on the well-known BM25 matching function) and the best-performing systems (with more sophisticated retrieval methods) from previous TREC campaigns. Our findings reveal that the proposed frameworks, especially AspectRanker and LmDiv, outperform both non-diversified rankings and two strong diversification baselines (i.e., xQuAD and its variant) in terms of various effectiveness metrics.
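To make the notion of explicit diversification concrete, the following is a minimal Python sketch of a greedy re-ranker in the style of xQuAD, the baseline named in the abstract. It follows the commonly cited formulation, in which a document's score mixes its relevance with its coverage of aspects not yet covered by the documents already selected. It is not the supervised LTRDiv, AspectRanker or LmDiv frameworks, whose details are not given in this record. The function name, the input dictionaries and the default trade-off value `lam=0.5` are assumptions made for illustration.

```python
from typing import Dict, List


def xquad_rerank(
    relevance: Dict[str, float],            # assumed input: P(d|q) per document
    aspect_weight: Dict[str, float],        # assumed input: P(a|q) per aspect
    coverage: Dict[str, Dict[str, float]],  # assumed input: coverage[a][d] = P(d|a)
    k: int,
    lam: float = 0.5,                       # relevance/diversity trade-off (assumed default)
) -> List[str]:
    """Greedily pick k documents, rewarding coverage of still-uncovered aspects."""
    selected: List[str] = []
    # novelty[a] is the product over selected d' of (1 - P(d'|a)).
    novelty = {a: 1.0 for a in aspect_weight}
    candidates = set(relevance)

    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            diversity = sum(
                aspect_weight[a] * coverage[a].get(doc, 0.0) * novelty[a]
                for a in aspect_weight
            )
            return (1.0 - lam) * relevance[doc] + lam * diversity

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)

        # Aspects covered by the chosen document become less novel.
        for a in aspect_weight:
            novelty[a] *= 1.0 - coverage[a].get(best, 0.0)

    return selected
```

In this formulation the aspect weights play the role of the aspect importance values that the abstract says AspectRanker aims to set more accurately. With `lam=0.0` the sketch reduces to a plain relevance ranking.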
Information Processing and Management

Suggestions

Characterizing web search queries that match very few or no results
Altıngövde, İsmail Sengör; Cambazoglu, Berkant Barla; Ozcan, Rifat; Sarigil, Erdem; Ulusoy, Özgür (2012-12-19)
Despite continuous efforts to improve web search quality, a non-negligible fraction of user queries end up with very few or even no matching results in leading web search engines. In this work, we provide a detailed characterization of such queries based on an analysis of a real-life query log. Our experimental setup allows us to characterize the queries with few/no results and compare the mechanisms employed by the major search engines in handling them.
Supervised learning for image search result diversification
Göynük, Burak; Altıngövde, İsmail Sengör; Department of Computer Engineering (2019)
Due to the ambiguity of user queries and the growing volume of data on the internet, methods for diversifying search results have recently gained importance. While earlier works mostly focus on text search, a similar need also exists for image data, which grows rapidly as people produce and share images via their smartphones and social media applications such as Instagram, Snapchat, and Facebook. Therefore, in this thesis, we focus on the result diversification problem for image search. To this end, as o...
Exploring the relationship between web presence and web usability for universities: A case study from Turkey
Peker, Serhat; Kucukozer-Cavdar, Seyma; Çağıltay, Kürşat (Emerald, 2016-01-01)
Purpose - The purpose of this paper is to statistically explore the relationship between the web usability and web presence of universities. As a case study, five Turkish universities at different positions in the Webometrics rankings were selected, evaluated and compared.
A framework for aspect based sentiment analysis on Turkish informal texts
Karagöz, Pınar; Öztürk, Murat; Toroslu, İsmail Hakkı (Springer Science and Business Media LLC, 2019-12-01)
The web provides a suitable medium for users to share opinions on various topics, including consumer products, events or news. In most such content, authors express different opinions on different features (i.e., aspects) of the topic. It is common practice to express a positive opinion on one aspect and a negative opinion on another within the same post. Conventional sentiment analysis methods do not capture such details; instead, they generate an overall sentiment score. In aspect based sentiment ...
Advanced methods for diversification of results in general-purpose and specialized search engines
Yiğit Sert, Sevgi; Altıngövde, İsmail Sengör; Ulusoy, Özgür; Department of Computer Engineering (2020-12-28)
Diversifying search results is a common mechanism in information retrieval to satisfy more users by surfacing documents that address different possible intentions of users. It aims to generate a result list that is both relevant and diverse when ambiguous and/or broad queries appear. Such queries have different underlying subtopics (a.k.a., aspects or interpretations) that search result diversification algorithms should consider. In this thesis, we first address search result diversification as a useful met...
Citation Formats
S. Yigit-Sert, İ. S. Altıngövde, C. Macdonald, I. Ounis, and Ö. Ulusoy, “Supervised approaches for explicit search result diversification,” Information Processing and Management, 2020. [Online]. Available: https://hdl.handle.net/11511/56319.