On the Efficiency of Selective Search

2017-04-13
Hafizoglu, Fatih
Kucukoglu, Emre Can
Altıngövde, İsmail Sengör
Our work shows that the query latency for selective search over a topically partitioned collection can be reduced by up to 55%. We achieve this by physically storing the documents in each topical cluster across all shards and building a cluster-skipping index at each shard. Our approach also achieves uniform load balance among the shards.
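The layout described above can be illustrated with a minimal sketch. It is not the authors' implementation: the round-robin placement, the function names (`partition`, `build_cluster_skipping_index`, `search_shard`, `selective_search`), and the raw term-frequency scoring are all assumptions made for illustration, and the set of selected clusters is assumed to come from a separate resource-selection step. The sketch spreads the documents of each topical cluster over every shard, groups each term's postings by cluster at each shard, and lets a query read only the groups belonging to the selected clusters.

```python
from collections import defaultdict


def partition(docs, num_shards):
    """Spread the documents of each cluster over all shards (round-robin, an assumption).

    docs is a list of (doc_id, cluster_id, text) tuples.
    """
    shards = [[] for _ in range(num_shards)]
    next_slot = defaultdict(int)  # per-cluster round-robin counter
    for doc_id, cluster_id, text in docs:
        shard_no = next_slot[cluster_id] % num_shards
        shards[shard_no].append((doc_id, cluster_id, text))
        next_slot[cluster_id] += 1
    return shards


def build_cluster_skipping_index(shard_docs):
    """Build term -> {cluster_id: [(doc_id, tf), ...]} for one shard.

    Postings are grouped by cluster so that the groups of unselected
    clusters can be skipped without being read.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, cluster_id, text in shard_docs:
        counts = defaultdict(int)
        for token in text.lower().split():
            counts[token] += 1
        for token, tf in counts.items():
            index[token][cluster_id].append((doc_id, tf))
    return index


def search_shard(index, query_terms, selected_clusters):
    """Score documents on one shard, touching only the selected clusters."""
    scores = defaultdict(int)
    for term in query_terms:
        groups = index.get(term)
        if groups is None:
            continue
        for cluster_id in selected_clusters:
            for doc_id, tf in groups.get(cluster_id, ()):
                scores[doc_id] += tf  # placeholder scoring: sum of term frequencies
    return scores


def selective_search(indexes, query, selected_clusters, k=10):
    """Run the query on every shard and merge the per-shard results."""
    terms = query.lower().split()
    merged = {}
    for index in indexes:
        # Each document lives on exactly one shard, so the keys never collide.
        merged.update(search_shard(index, terms, selected_clusters))
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Because every shard holds a slice of every cluster, each shard does a similar amount of work for any set of selected clusters. This is the intuition behind the load-balance claim, under the assumptions of this sketch.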

Suggestions

Improving the efficiency of distributed information retrieval using hybrid index partitioning
Hafızoğlu, Fatih; Altıngövde, İsmail Sengör; Department of Computer Engineering (2018)
Selective search with traditional partitioning has advantages over exhaustive search in terms of total query cost. However, by its nature, it often suffers from high query latency and load imbalance. To overcome these issues, we propose a new partitioning method in this thesis, namely Hybrid partitioning. Our studies show that it is possible to obtain significant savings in query latency with this new partitioning methodology. In addition to that, query processing with Hybrid partitioning...
Sampling of the Wiener Process for Remote Estimation Over a Channel With Random Delay
Sun, Yin; Polyanskiy, Yury; Uysal, Elif (Institute of Electrical and Electronics Engineers (IEEE), 2020-02-01)
In this paper, we consider a problem of sampling a Wiener process, with samples forwarded to a remote estimator over a channel that is modeled as a queue. The estimator reconstructs an estimate of the real-time signal value from causally received samples. We study the optimal online sampling strategy that minimizes the mean square estimation error subject to a sampling rate constraint. We prove that the optimal sampling strategy is a threshold policy, and find the optimal threshold. This threshold is determ...
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model for executing time-consuming analytical queries as a batch of tasks on large-scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
On the efficiency of authentication protocols, digital signatures and their applications in e-health: a top-down approach
Bıçakçı, Kemal; Baykal, Nazife; Department of Information Systems (2003)
Choosing an authentication protocol or a digital signature algorithm becomes more challenging when performance constraints are of concern. In this thesis, we discuss the possible options in a top-down approach and propose viable alternatives for the efficiency criteria. Before all the technical discussions, we argue that identifying prerequisites, threats and risks in an organizational context is of utmost importance so that effective solutions can be delivered at a reasonable cost. For instance, one approach to sol...
An index structure for fuzzy databases
Yazıcı, Adnan (1996-09-11)
Fuzzy querying involves more complex processing than ordinary querying does. In addition, fuzzy conditions are likely to select a larger number of tuples than crisp ones. The current index structures are inefficient in representing and dealing with uncertain and fuzzy data. In this paper we extend one of the multi-dimensional data structures, namely the Multi Level Grid File (Whang and Krishnamurty, 1991), for efficient access to both crisp and fuzzy data. In order to take advantage of the ...
Citation Formats
F. Hafizoglu, E. C. Kucukoglu, and İ. S. Altıngövde, “On the Efficiency of Selective Search,” 2017, vol. 10193, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/39082.