On the Additivity and Weak Baselines for Search Result Diversification Research

Akcay, Mehmet
Altıngövde, İsmail Sengör
Macdonald, Craig
Ounis, Iadh
A recent study on the topic of additivity addresses the task of search result diversification and concludes that while weaker baselines are almost always significantly improved by the evaluated diversification methods, for stronger baselines, just the opposite happens, i.e., no significant improvement can be observed. Due to the importance of the issue in shaping future research directions and evaluation strategies in search result diversification, in this work, we first aim to reproduce the findings reported in the previous study, and then investigate its possible limitations. Our extensive experiments first reveal that under the same experimental setting as that previous study, we can reach similar results. Next, we hypothesize that for stronger baselines, tuning the parameters of some methods (i.e., the trade-off parameter between the relevance and diversity of the results in this particular scenario) should be done in a more fine-grained manner. With trade-off parameters that are specifically determined for each baseline run, we show that the percentage of significant improvements even over the strong baselines can be doubled. As a further issue, we discuss the possible impact of using the same strong baseline retrieval function for the diversity computations of the methods. Our takeaway message is that in the case of a strong baseline, it is more crucial to tune the parameters of the diversification methods to be evaluated; but once this is done, additivity is achievable.