Limitations and improvement opportunities for implicit result diversification in search engines

Ulu, Yaşar Barış
Search engine users expect results that are relevant to their query. Because a query may carry several possible intents, the results should also cover those different intents, which leads to the well-known problem of search result diversification. Our work first investigates the limitations of implicit search result diversification and, in particular, reveals that typical optimization tricks (such as clustering) do not necessarily improve diversification effectiveness. Then, as our second contribution, we explore whether recently introduced word embeddings can be exploited to represent documents for diversification, and we obtain a positive result. Third, because our detailed analysis reveals that the candidate set size plays a critical role in implicit diversification, we propose to automatically predict the size of the candidate set on a per-query basis. To this end, we use a rich set of features based on the similarity among documents and the similarity between queries and documents. Finally, we propose caching the similarities of document pairs to improve the processing-time efficiency of implicit result diversification.
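To make the role of the candidate set size and of pairwise-similarity caching concrete, the following is a minimal sketch and not the thesis's implementation. It assumes Maximal Marginal Relevance (MMR) as the implicit diversification method and cosine similarity over document vectors; the abstract names neither. All identifiers (`PairSimilarityCache`, `mmr_diversify`, `lam`, `candidate_size`) are hypothetical names chosen for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PairSimilarityCache:
    """Hypothetical cache: each unordered document pair is compared at most once."""

    def __init__(self, vectors: Dict[str, Vector]) -> None:
        self.vectors = vectors
        self._cache: Dict[Tuple[str, str], float] = {}
        self.misses = 0  # number of similarities actually computed

    def get(self, d1: str, d2: str) -> float:
        # Order the key so (d1, d2) and (d2, d1) share one cache entry.
        key = (d1, d2) if d1 <= d2 else (d2, d1)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = cosine(self.vectors[d1], self.vectors[d2])
        return self._cache[key]


def mmr_diversify(
    relevance: Dict[str, float],
    sims: PairSimilarityCache,
    k: int,
    lam: float = 0.5,
    candidate_size: Optional[int] = None,
) -> List[str]:
    """Greedy MMR re-ranking (an assumed stand-in for implicit diversification).

    Only the top `candidate_size` documents by relevance are considered;
    None means the whole result list is the candidate set.
    """
    candidates = sorted(relevance, key=relevance.get, reverse=True)[:candidate_size]
    selected: List[str] = []
    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            # Penalize similarity to the closest already-selected document.
            max_sim = max((sims.get(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * max_sim

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

In this sketch, `candidate_size` stands in for the quantity the thesis proposes to predict per query, and reusing one `PairSimilarityCache` across calls avoids recomputing similarities for document pairs that have already been compared.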
Citation Formats
Y. B. Ulu, “Limitations and improvement opportunities for implicit result diversification in search engines,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences, Computer Engineering, Middle East Technical University, 2019.