Improvement of corpus-based semantic word similarity using vector space model

2009
Esin, Yunus Emre
This study presents a new approach for finding semantically similar words from corpora using window-based context methods. Previous studies mainly concentrate on either finding new combinations of distance-weight measurement methods or proposing new context methods. The main difference of this approach is that it reprocesses the outputs of existing methods to update the representation of the related word vectors used for measuring semantic distance between words, in order to improve the results further. Moreover, this technique provides a solution to the data sparseness of vectors, a common problem in methods that use the vector space model. The main advantage of this approach is that it is applicable to many of the existing word similarity methods that use the vector space model. Its other, and most important, advantage is that it improves the performance of some of these existing word similarity methods.
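As a rough illustration of the kind of baseline the abstract refers to, the sketch below builds window-based context vectors and compares two words with cosine similarity. It is not the thesis's method: the window size, the raw co-occurrence counts, the choice of cosine as the measure, the toy corpus, and the function names `context_vectors` and `cosine` are all assumptions made for this example. The reprocessing step is not shown because the abstract does not specify it.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens, window=2):
    """For each word, count the words seen within `window` positions of it.

    Hypothetical helper: raw counts are used as weights for simplicity.
    """
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector shares no context with anything
    return dot / (norm_u * norm_v)


# Toy corpus, assumed for illustration only.
corpus = "the cat drinks milk the dog drinks water the cat chases the dog".split()
vectors = context_vectors(corpus, window=2)

sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat~dog:  {sim_cat_dog:.3f}")
print(f"cat~milk: {sim_cat_milk:.3f}")
```

On a corpus this small, most entries of each vector are zero, which is the data sparseness problem the abstract mentions.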

Suggestions

Natural language query processing in ontology based multimedia databases
Aygül, Filiz Alaca; Çiçekli, Fehime Nihan; Department of Computer Engineering (2010)
In this thesis a natural language query interface is developed for semantic and spatio-temporal querying of MPEG-7 based domain ontologies. The underlying ontology is created by attaching domain ontologies to the core Rhizomik MPEG-7 ontology. The user can pose concept, complex concept (objects connected with an “AND” or “OR” connector), spatial (left, right . . . ), temporal (before, after, at least 10 minutes before, 5 minutes after . . . ), object trajectory and directional trajectory (east, west, southe...
Using semantic web services for data integration in banking domain
Okat, Çağlar; Doğru, Ali Hikmet; Department of Computer Engineering (2010)
A semantic model oriented transformation mechanism is developed for the centralization of intra-enterprise data integration. Such a mechanism is especially crucial in the banking domain which is selected in this study. A new domain ontology is constructed to provide basis for annotations. A bottom-up approach is preferred for semantic annotations to utilize existing web service definitions. Transformations between syntactic web service XML responses and semantic model concepts are defined in transformation ...
Improvement on Corpus-Based Word Similarity Using Vector-Space Models
Esin, Yunus Emre; Alan, Özgür; Alpaslan, Ferda Nur (2009-09-16)
This paper presents a new approach for finding semantically similar words from a large text collection using window-based context methods. Previous studies on this problem mainly concentrate on finding new methods, which are either new combinations of distance-weight measurement methods or new context methods. The main difference of our approach is that we focus on reprocessing the outputs of existing methods to update the representation of related word vectors, which are used for measuring semantic distance between word...
A test oriented service and object model for software product lines
Parlakol, Nazif Bülent; Karagöz, Pınar; Department of Computer Engineering (2010)
In this thesis, a new modeling technique is proposed for minimizing regression testing effort in software product lines. The “Product Flow Model” is used for the common representation of products in application engineering and the “Domain Service and Object Model” represents the variant based relations between products and core assets. This new approach provides a solution for avoiding unnecessary work load of regression testing using the principles of sub-service decomposition and variant based product/sub...
Smoothing and differentiation of dynamic data
Titrek, Fatih; Tarı, Zehra Sibel; Department of Computer Engineering (2010)
Smoothing is an important part of the pre-processing step in signal processing. A signal that is purified from noise as much as possible is necessary to achieve our aim. There are many smoothing algorithms that give good results on stationary data, but these algorithms do not give the expected results on non-stationary data. Studying acceleration data is an effective method to see whether the smoothing is successful or not. The small part of the noise that takes place in the Displacement data wil...
Citation Formats
Y. E. Esin, “Improvement of corpus-based semantic word similarity using vector space model,” M.S. - Master of Science, Middle East Technical University, 2009.