Improving document ranking with query expansion based on BERT word embeddings
Date
2020
Author
Yeke, Doğuhan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
In this thesis, we present a query expansion approach based on contextualized word embeddings for improving document ranking performance. We employ Bidirectional Encoder Representations from Transformers (BERT) word embeddings to expand the original query with semantically similar terms. After determining the best method for extracting word embeddings from BERT, we extend the query with the best candidate terms. Our primary goal is to show how BERT performs against the Word2Vec model, a widely used method for representing terms in a vector space. Then, by leveraging the relevance judgment list, we show that integrating tf-idf and term co-occurrence properties into our query expansion system makes a positive contribution. Our experiments demonstrate that BERT outperforms Word2Vec on well-known evaluation metrics. We also conduct several experiments that address common issues in information retrieval systems.
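The general idea of embedding-based query expansion can be illustrated with a minimal sketch. This is not the thesis's implementation: the vectors below are hand-made toy values rather than BERT or Word2Vec embeddings, and the names `cosine`, `expand_query`, and `TOY_EMBEDDINGS` are hypothetical. The sketch assumes candidates are ranked by their mean cosine similarity to the query terms, which is only one of several plausible scoring choices.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(query_terms, embeddings, top_k=2):
    """Append the top_k vocabulary terms most similar to the query.

    Assumption: a candidate's score is its mean cosine similarity to
    every query term that has an embedding.
    """
    known = [t for t in query_terms if t in embeddings]
    if not known:
        return list(query_terms)
    scores = {}
    for term, vec in embeddings.items():
        if term in query_terms:
            continue  # never re-add a term already in the query
        sims = [cosine(vec, embeddings[q]) for q in known]
        scores[term] = sum(sims) / len(sims)
    # Highest score first; ties broken alphabetically for determinism.
    best = sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
    return list(query_terms) + best


# Toy 3-dimensional vectors, invented purely for illustration.
TOY_EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "vehicle": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
    "repair": [0.3, 0.9, 0.1],
    "maintenance": [0.25, 0.85, 0.2],
}

expanded = expand_query(["car", "repair"], TOY_EMBEDDINGS, top_k=2)
print(expanded)
```

In a real system the toy dictionary would be replaced by vectors extracted from a trained model, and the expanded query would then be passed to the ranking function. The abstract also mentions tf-idf and term co-occurrence signals; one could fold those into the candidate score, but that step is not shown here.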
Subject Keywords
Querying (Computer science), Query Expansion, BERT, Document Ranking, Relevance Feedback
URI
http://etd.lib.metu.edu.tr/upload/12625500/index.pdf
https://hdl.handle.net/11511/45743
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
Using object-oriented materialized views to answer selection-based complex queries
Alhajj, R; Polat, Faruk (1999-09-01)
Presented in this paper is a model that utilizes existing materialized views to handle a wide range of complex selection-based queries, including linear recursive queries. Such queries are complex because it is almost impossible for naive users to predict the formulation of their predicate expressions. Object variables bound to objects in the result of a query are allowed to appear in the predicate of that query. Also, the predicate definition is extended to make it possible to have in the output only a sub...
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on large scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
Semantic information-based alternative plan generation for multiple query optimization
Polat, Faruk; Alhajj, R (Elsevier BV, 2001-09-01)
This paper addresses the impact of semantic information about queries on alternative plan generation (APG) for multiple query optimization (MQO). MQO covers optimizing the execution of a set of queries together where each query in the set to be optimized has several alternative execution plans. A multiple query optimizer selects an alternative plan for each query to obtain an optimal global execution plan. Our approach uses information such as common relations, common possible joins and common conditions to...
A transcoding robust data hiding method for image communication applications
Candan, Çağatay (2005-09-14)
We present a data embedding method for image communication applications. Our goal is to implement novel multimedia applications such as multi-language captions, interactive programming and title specific features over the existing image communication channel. To this aim, we present a data embedding method for JPEG images which has the desired degree of robustness to transcoding or bitrate adjustments that may take place in the communication channel. The described system is designed for JPEG images but can ...
Analyzing the polarity of opinionated queries
Chelaru, Sergiu; Altıngövde, İsmail Sengör; Siersdorfer, Stefan (2012-04-27)
In this paper, we present an in-depth analysis of Web search queries for controversial topics, focusing on query sentiment. To this end, we conduct extensive user assessments as well as an automatic sentiment analysis using the SentiWordNet thesaurus.
Citation Formats
IEEE
D. Yeke, “Improving document ranking with query expansion based on BERT word embeddings,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Computer Engineering., Middle East Technical University, 2020.