Automatic Usage Disambiguation of the Enclitic dA in Turkish

2022-8
Ersöyleyen, Elif Ebru
Discourse is composed of several constituents that together give it a coherent structure. One interesting aspect of discourse is the role of discourse connectives and their contribution to discourse structure. Discourse connectives are lexico-syntactic elements that signal a semantic relation between two discourse units (clauses and sentences). Clitics are morphemes that are phonologically dependent on the lexical item to which they are attached but have separate syntactic forms, and they carry no meaning by themselves. Clitics can function as discourse connectives in several languages; for example, the clitic pas in Cuzco Quechua and dA in Turkish can signal multiple senses and have features that distinguish them from affixes and from other words. dA is essentially a focus-associated enclitic that also has discourse functions in Turkish, conveying contrastive, additive, causal, and conditional senses. In other words, like other linguistic expressions, dA is subject to ambiguity, which poses a challenge for automated natural language processing tasks. The aim of this study is twofold: (a) to analyze the linguistic behavior of dA by annotating its discourse and non-discourse occurrences in corpora of written Turkish, and (b) to develop machine learning models that distinguish its discourse usage from its non-discourse usage, i.e., its role as a discourse connective versus its role as a focus enclitic. The thesis describes the annotation study and the machine learning models, which use linguistic features. The results of our machine learning experiments show that we can disambiguate the discourse usage of dA with an F1-score of 0.83 in free text.
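
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score is computed for a binary discourse vs. non-discourse decision. It is not the thesis's evaluation code: the function name `f1_score`, the label strings, and the toy gold and predicted labels are all hypothetical, and the abstract does not state which class or averaging scheme the reported 0.83 refers to.

```python
def f1_score(gold, predicted, positive="discourse"):
    """Return the F1-score for the `positive` label (hypothetical helper)."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Invented toy labels for five occurrences of dA (not data from the thesis).
gold = ["discourse", "discourse", "non-discourse", "discourse", "non-discourse"]
pred = ["discourse", "non-discourse", "non-discourse", "discourse", "discourse"]

score = f1_score(gold, pred)
print(round(score, 2))
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1-score is also 2/3.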

Citation Formats
E. E. Ersöyleyen, “Automatic Usage Disambiguation of the Enclitic dA in Turkish,” M.S. - Master of Science, Middle East Technical University, 2022.