Scalable and Efficient Web Search Result Diversification

Date

2016-08-01

Author

Naini, Kaweh Djafari
Altıngövde, İsmail Sengör
Siberski, Wolf

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

It has been shown that top-k retrieval quality can be considerably improved by taking not only relevance but also diversity into account. However, the diversification approaches proposed so far have paid little attention to practical usability in large-scale settings, such as modern web search systems. In this work, we make two contributions toward this goal. First, we propose a combination of optimizations and heuristics for an implicit diversification algorithm based on the desirable facility placement principle, and present two algorithms that achieve linear complexity without compromising retrieval effectiveness. Instead of exhaustively comparing documents, these algorithms first perform a clustering phase and then exploit its outcome to compose the diverse result set. Second, we describe and analyze two variants of distributed diversification in a computing cluster, for large-scale IR where the document collection is too large to keep on one node. Our contribution in this direction is pioneering, as no earlier work in the literature investigates the effectiveness and efficiency of diversification in a distributed setup. Extensive evaluations on a standard TREC framework demonstrate that the proposed optimizations achieve retrieval quality competitive with the baseline algorithm while reducing processing time by more than 80% and up to 97%, and shed light on the efficiency and effectiveness tradeoffs of diversification when applied on top of a distributed architecture.
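
The cluster-then-select idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the single-pass leader clustering, the cosine-similarity threshold, the round-robin selection, and all names (`cosine`, `cluster_then_select`, `threshold`) are assumptions chosen only to show how a clustering phase can replace exhaustive pairwise comparison of documents. Under these assumptions each document is compared only against cluster leaders, so the cost grows linearly with the number of documents when the number of clusters stays bounded.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_then_select(docs, k, threshold=0.8):
    """Return up to k document ids, spread across clusters.

    docs: list of (doc_id, relevance, vector), sorted by descending relevance.
    threshold: hypothetical similarity cutoff for joining an existing cluster.
    """
    # Phase 1: single-pass leader clustering. Each document is compared
    # only with the leader (first, most relevant member) of each cluster.
    clusters = []
    for doc in docs:
        for cluster in clusters:
            if cosine(doc[2], cluster[0][2]) >= threshold:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])

    # Phase 2: round-robin over clusters, taking the most relevant
    # remaining document from each cluster in turn.
    result = []
    rank = 0
    while len(result) < k and any(rank < len(c) for c in clusters):
        for cluster in clusters:
            if rank < len(cluster) and len(result) < k:
                result.append(cluster[rank][0])
        rank += 1
    return result


if __name__ == "__main__":
    docs = [
        ("d1", 0.9, [1.0, 0.0]),
        ("d2", 0.8, [0.99, 0.1]),
        ("d3", 0.7, [0.0, 1.0]),
        ("d4", 0.6, [0.1, 0.99]),
    ]
    print(cluster_then_select(docs, k=2))  # ['d1', 'd3']
```

With the toy input above, `d1` and `d2` fall into one cluster and `d3` and `d4` into another, so a top-2 request returns one document from each cluster rather than the two most relevant but near-duplicate documents.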

Subject Keywords

Computer Networks and Communications

URI

https://hdl.handle.net/11511/38801

Journal

ACM TRANSACTIONS ON THE WEB

DOI

https://doi.org/10.1145/2907948

Collections

Department of Computer Engineering, Article

Citation Formats

K. D. Naini, İ. S. Altıngövde, and W. Siberski, “Scalable and Efficient Web Search Result Diversification,” ACM TRANSACTIONS ON THE WEB, 2016. [Online]. Available: https://hdl.handle.net/11511/38801.