Idioms as multi-word expressions in Turkish

Güven, Arzu Burcu
Idioms pose several challenges for both Natural Language Processing (NLP) and linguistic analysis. A better understanding of idioms will yield valuable insights into natural language as well as the way it is processed. The relevance of idioms, along with the fact that Turkish is a rather unexplored language from this perspective, motivates us to work on Turkish idioms. Here, we aim to present a grammatical study of Turkish idioms that were selected in accordance with distributional models.


An examination of quantifier scope ambiguity in Turkish
Kurt, Kürşad; Bozşahin, Hüseyin Cem; Department of Cognitive Sciences (2006)
This study investigates the problem of quantifier scope ambiguity in natural languages and the various ways in which it has been accounted for, some of which are problematic for monotonic theories of grammar like Combinatory Categorial Grammar (CCG), which strive for solutions that avoid non-monotonic functional application and assume complete transparency at the syntax-semantics interface of a language. Another purpose of this thesis is to explore these proposals on examples from Turkish and...
Paracompositionality, MWEs and argument substitution
Bozşahin, Hüseyin Cem (2018-08-11)
Multi-word expressions, verb-particle constructions, idiomatically combining phrases, and phrasal idioms have something in common: not all of their elements contribute to the argument structure of the predicate implicated by the expression. Radically lexicalized theories of grammar that avoid string-, term-, logical form-, and tree-writing, and categorial grammars that avoid wrap operation, make predictions about the categories involved in verb-particles and phrasal idioms. They may require singleton types,...
Expanding horizons of cross-linguistic research on reading: The Multilingual Eye-movement Corpus (MECO)
Siegelman, Noam; et al. (2022-02-01)
Scientific studies of language behavior need to grapple with a large diversity of languages in the world and, for reading, a further variability in writing systems. Yet, the ability to form meaningful theories of reading is contingent on the availability of cross-linguistic behavioral data. This paper offers new insights into aspects of reading behavior that are shared and those that vary systematically across languages through an investigation of eye-tracking data from 13 languages recorded during text rea...
Twitter Sentiment Analysis Experiments Using Word Embeddings on Datasets of Various Scales
Arslan, Yusuf; Kucuk, Dilek; Birtürk, Ayşe Nur (2018-06-15)
Sentiment analysis is a popular research topic in social media analysis and natural language processing. In this paper, we present the details and evaluation results of our Twitter sentiment analysis experiments, which are based on word embedding vectors such as word2vec and doc2vec, using an ANN classifier. In these experiments, we utilized two publicly available sentiment analysis datasets and four smaller datasets derived from these datasets, in addition to a publicly available trained vector model over ...
Özdemir, Gizem Nur; Bozşahin, Hüseyin Cem; Department of Bioinformatics (2021-07-14)
In agglutinating languages such as Turkish, derivation is mostly performed by adding suffixes to the end of words. Most derivational suffixes carry distinctive semantic content, and representing them plays an important role in computational tasks such as question answering. In this thesis, we aim to explore the structure of some frequent Turkish derivational suffixes in distributional vector space by clustering their word embedding vectors and analyzing their underlying semantic prop...
