Predicting the size of candidate document set for implicit web search result diversification

2020-01-01
© Springer Nature Switzerland AG 2020. Implicit result diversification methods exploit the content of the documents in the candidate set, i.e., the initial retrieval results of a query, to obtain a ranking that is both relevant and diverse. As our first contribution, we explore whether recently introduced word embeddings can be used to represent documents and thereby improve diversification, and we find that they can. As a second improvement, we propose to automatically predict the size of the candidate set on a per-query basis. Experimental evaluations using our BM25 runs as well as the best-performing ad hoc runs submitted to TREC (2009–2012) show that our approach improves the performance of implicit diversification by up to 5.4% with respect to the initial ranking.
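To illustrate why the candidate set size matters, here is a minimal sketch of implicit diversification over the top-k candidates. It assumes a greedy Maximal Marginal Relevance (MMR) style re-ranker and cosine similarity over document vectors; the abstract does not say which diversification method or similarity measure the authors use, so both are stand-ins. The names `cosine`, `mmr_diversify`, `k`, and `lam`, and the toy relevance scores and vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_diversify(candidates, k, lam=0.5):
    """Greedy MMR-style re-ranking of the top-k candidates (illustrative assumption).

    candidates: list of (doc_id, relevance, vector) in initial retrieval order.
    k:          candidate set size; only the first k documents are re-ranked.
    lam:        trade-off between relevance (lam) and novelty (1 - lam).
    """
    pool = list(candidates[:k])
    selected = []
    while pool:
        def score(item):
            _, relevance, vector = item
            # Penalize similarity to the closest already-selected document.
            max_sim = max((cosine(vector, s[2]) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * max_sim

        best = max(pool, key=score)
        pool.remove(best)
        selected.append(best)
    return [doc_id for doc_id, _, _ in selected]


# Toy data: d1 and d2 cover one aspect, d3 and d4 another (made-up values).
toy_candidates = [
    ("d1", 1.0, [1.0, 0.0]),
    ("d2", 0.9, [1.0, 0.0]),
    ("d3", 0.8, [0.0, 1.0]),
    ("d4", 0.7, [0.0, 1.0]),
]

small_set = mmr_diversify(toy_candidates, k=2)
larger_set = mmr_diversify(toy_candidates, k=3)
```

With `k=2` the re-ranker only sees two near-duplicate documents and cannot diversify, whereas with `k=3` it can promote a document on the second aspect. In this toy setting that is the effect a per-query choice of candidate set size is meant to control.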
Citation Formats
Y. B. Ulu and İ. S. Altıngövde, “Predicting the size of candidate document set for implicit web search result diversification,” 2020, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/58018.