Show/Hide Menu
Hide/Show Apps
Logout
Türkçe
Türkçe
Search
Search
Login
Login
OpenMETU
OpenMETU
About
About
Open Science Policy
Open Science Policy
Open Access Guideline
Open Access Guideline
Postgraduate Thesis Guideline
Postgraduate Thesis Guideline
Communities & Collections
Communities & Collections
Help
Help
Frequently Asked Questions
Frequently Asked Questions
Guides
Guides
Thesis submission
Thesis submission
MS without thesis term project submission
MS without thesis term project submission
Publication submission with DOI
Publication submission with DOI
Publication submission
Publication submission
Supporting Information
Supporting Information
General Information
General Information
Copyright, Embargo and License
Copyright, Embargo and License
Contact us
Contact us
Improving the efficiency of distributed information retrieval using hybrid index partitioning
Download
index.pdf
Date
2018
Author
Hafızoğlu, Fatih
Metadata
Show full item record
Item Usage Stats
277
views
101
downloads
Cite This
Selective search with traditional partitioning have advantages over exhaustive search in terms of total query cost. However, it can suffer from query latency and load imbalance for most of the time due to its nature. To overcome these issues, we proposed a new partitioning method in this thesis, namely Hybrid partitioning. Our studies shows that it is possible to obtain significant savings in query latency with this new partitioning methodology. In addition to that, query processing with Hybrid partitioning also achieves perfect load balancing and provides resource optimization, which is a key point for low resource environments.
Subject Keywords
Electronic data processing
,
Querying (Computer science).
,
Information retrieval.
,
Database searching.
URI
http://etd.lib.metu.edu.tr/upload/12622264/index.pdf
https://hdl.handle.net/11511/27572
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
OpenMETU
Core
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on large scale data. During simultaneous execution of multiple queries, many oppor- tunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
A Cost-Aware Strategy for Query Result Caching in Web Search Engines
Altıngövde, İsmail Sengör; Ulusoy, Oezguer (2009-01-01)
Search engines and large scale IR systems need to cache query results for efficiency and scalability purposes. In this study, we propose to explicitly incorporate the query costs in the static caching policy. To this end, a query’s cost is represented by its execution time, which involves CPU time to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using a large Web crawl data and a real query log reveal that the proposed strategy impr...
On the Efficiency of Selective Search
Hafizoglu, Fatih; Kucukoglu, Emre Can; Altıngövde, İsmail Sengör (2017-04-13)
Our work shows that the query latency for selective search over a topically partitioned collection can be reduced by up to 55%. We achieve this by physically storing the documents in each topical cluster across all shards and building a cluster-skipping index at each shard. Our approach also achieves uniform load balance among the shards.
Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition
Buyuksahin, Umit Cavus; Ertekin Bolelli, Şeyda (Elsevier BV, 2019-10-07)
Many applications in different domains produce large amount of time series data. Making accurate forecasting is critical for many decision makers. Various time series forecasting methods exist that use linear and nonlinear models separately or combination of both. Studies show that combining of linear and nonlinear models can be effective to improve forecasting performance. However, some assumptions that those existing methods make, might restrict their performance in certain situations. We provide a new Au...
Using object-oriented materialized views to answer selection-based complex queries
Alhajj, R; Polat, Faruk (1999-09-01)
Presented in this paper is a model that utilizes existing materialized views to handle a wide range of complex selection-based queries, including linear recursive queries. Such queries are complex because it is almost impossible for naive users to predict the formulation of their predicate expressions. Object variables bound to objects in the result of a query are allowed to appear in the predicate of that query. Also, the predicate definition is extended to make it possible to have in the output only a sub...
Citation Formats
IEEE
ACM
APA
CHICAGO
MLA
BibTeX
F. Hafızoğlu, “Improving the efficiency of distributed information retrieval using hybrid index partitioning,” M.S. - Master of Science, Middle East Technical University, 2018.