Efficient discovery of join plans in schemaless data
Date
2009-09-01
Author
Acar, Aybar Can
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
We describe a method of inferring join plans for a set of relation instances in the absence of any metadata, such as attribute domains, attribute names, or constraints (e.g., keys or foreign keys). Our method enumerates the possible join plans in order of likelihood, based on the compatibility of a pair of columns and their suitability as join attributes (i.e., their appropriateness as keys). We outline two variants of the approach. The first variant is accurate but potentially time-consuming, especially for large relations that do not fit in memory. The second variant approximates the first and is therefore less accurate, but it is considerably more efficient, allowing the method to be used online even for large relations. We provide experimental results showing how both variants scale in terms of performance as the number of candidate join attributes and the size of the relations increase. We also characterize the accuracy of the approximate variant with respect to the exact variant.
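The idea of ranking column pairs by compatibility and key suitability can be illustrated with a small sketch. This is not the paper's algorithm: the scoring rule (value containment multiplied by a uniqueness ratio) and the names `column_stats`, `join_score`, and `rank_join_candidates` are assumptions chosen only to make the two criteria from the abstract concrete.

```python
from itertools import product


def column_stats(values):
    """Return (set of distinct values, uniqueness ratio) for one column."""
    distinct = set(values)
    uniqueness = len(distinct) / len(values) if values else 0.0
    return distinct, uniqueness


def join_score(left, right):
    """Hypothetical score in [0, 1] for joining `left` (referencing side)
    to `right` (candidate key side).

    compatibility: fraction of left's distinct values that occur in right
    key suitability: fraction of right's rows that carry a distinct value
    """
    left_distinct, _ = column_stats(left)
    right_distinct, right_uniqueness = column_stats(right)
    if not left_distinct or not right_distinct:
        return 0.0
    containment = len(left_distinct & right_distinct) / len(left_distinct)
    return containment * right_uniqueness


def rank_join_candidates(rel_a, rel_b):
    """Rank every column pair of two relations, best candidate first.

    Each relation is a list of columns; each column is a list of values.
    Returns a list of (score, column index in rel_a, column index in rel_b).
    """
    scored = []
    for (i, col_a), (j, col_b) in product(enumerate(rel_a), enumerate(rel_b)):
        # Either side may play the key role, so score both directions.
        score = max(join_score(col_a, col_b), join_score(col_b, col_a))
        scored.append((score, i, j))
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored


# Two unnamed relations: the second column of `orders` references
# the first column of `customers`.
orders = [[1, 2, 3, 4], [10, 20, 10, 30]]
customers = [[10, 20, 30], ["ann", "bob", "cy"]]
best = rank_join_candidates(orders, customers)[0]
print(best)
```

The sketch compares full columns in memory, which loosely corresponds to the costly setting the abstract describes for large relations; how the paper's approximate variant reduces that cost is not stated in the abstract and is not modeled here.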
Subject Keywords
Dependency Inference, Join inference, Schema Matching
URI
https://hdl.handle.net/11511/30683
DOI
https://doi.org/10.1145/1620432.1620434
Collections
Graduate School of Informatics, Conference / Seminar
Suggestions
Improvement of corpus-based semantic word similarity using vector space model
Esin, Yunus Emre; Alpaslan, Ferda Nur; Department of Computer Engineering (2009)
This study presents a new approach for finding semantically similar words from corpora using window-based context methods. Previous studies mainly concentrate on either finding new combinations of distance-weight measurement methods or proposing new context methods. The main difference of this new approach is that it reprocesses the outputs of the existing methods to update the representation of related word vectors used for measuring semantic distance between words, to improve the results further. ...
Fast, efficient and dynamically optimized data and hardware architectures for string matching
Zengin, Salih; Güran, Hasan Cengiz; Schmidt, Şenan Ece; Department of Electrical and Electronics Engineering (2014)
Many fields of computing such as network intrusion detection employ string matching modules (SMM) that search for a given set of strings in their input. An SMM is expected to produce correct outcomes while scanning the input data at high rates. Furthermore, the string sets that are searched for are usually large and their sizes increase steadily. In this thesis, motivated by the requirement of designing fast, accurate and efficient SMMs; we propose a number of SMM architectures that employ Bloom Filters to ...
Using semantic web services for data integration in banking domain
Okat, Çağlar; Doğru, Ali Hikmet; Department of Computer Engineering (2010)
A semantic model oriented transformation mechanism is developed for the centralization of intra-enterprise data integration. Such a mechanism is especially crucial in the banking domain which is selected in this study. A new domain ontology is constructed to provide basis for annotations. A bottom-up approach is preferred for semantic annotations to utilize existing web service definitions. Transformations between syntactic web service XML responses and semantic model concepts are defined in transformation ...
An index structure for fuzzy databases
Yazıcı, Adnan (1996-09-11)
Fuzzy querying involves more complex processing than ordinary querying does. In addition, a larger number of tuples will possibly be selected by fuzzy conditions compared to the crisp ones. The current index structures are inefficient in representing and dealing with uncertain and fuzzy data. In this paper we extend one of the multi-dimensional data structures, namely the Multi Level Grid File (Whang and Krishnamurty, 1991), for efficient access to both crisp and fuzzy data. In order to take advantage of the ...
Efficient processing of category-restricted queries for Web directories
Altıngövde, İsmail Sengör; Ulusoy, Oezguer (2008-01-01)
We show that a cluster-skipping inverted index (CS-IIS) is a practical and efficient file structure to support category-restricted queries for searching Web directories. The query processing strategy with CS-IIS improves CPU time efficiency without imposing any limitations on the directory size.
Citation Formats
IEEE
A. C. Acar, “Efficient discovery of join plans in schemaless data,” 2009. [Online]. Available: https://hdl.handle.net/11511/30683.