Head finalization and morphological analysis in factored phrase-based statistical machine translation from English to Turkish

2015
İmren, Haydar
Machine translation (MT) is the field of study concerned with automatically translating text from one natural language to another. Statistical machine translation (SMT) generates translations using statistical methods and bilingual text corpora. This study introduces an approach for translating from English to Turkish. Turkish is an agglutinative language with free constituent order, whereas English is not agglutinative and has a strict constituent order. Besides these differences, parallel corpora for this language pair are scarce, which makes SMT a challenging problem. To date, most research on this language pair has suggested representing the languages at the morpheme level. This study differs in that it not only represents English and Turkish at the morpheme level but also applies a reordering technique, Head Finalization, that has been used successfully for other languages grammatically similar to Turkish. Results are reported using the BLEU metric. With improvements in reordering and morpheme-level representation, we increased the BLEU score from a baseline of 19.62 to 30.93, which corresponds to an increase of 57%. The approach can also be applied to other languages that are close to Turkish in terms of word order, morphological structure and suffixation.
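The relative gain quoted in the abstract can be checked from the two reported BLEU scores. The following is a minimal sketch of that arithmetic only; the function name `relative_increase` is hypothetical and is not taken from the thesis.

```python
def relative_increase(baseline: float, improved: float) -> float:
    """Return the relative increase of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Scores as reported in the abstract.
baseline_bleu = 19.62  # baseline system
improved_bleu = 30.93  # with reordering and morpheme-level representation

gain = relative_increase(baseline_bleu, improved_bleu)
print(f"absolute gain: {improved_bleu - baseline_bleu:.2f} BLEU points")
print(f"relative gain: {gain:.1f}%")
```

The absolute gain is 11.31 BLEU points, and the relative gain works out to a little under 58%, which is consistent with the 57% stated in the abstract if the figure is truncated rather than rounded.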

Suggestions

Frequency-driven late fusion-based word decomposition approach on the phrase-based statistical machine translation systems
Tatlıcıoğlu, Mehmet; Yazıcı, Adnan; Department of Computer Engineering (2013)
Machine translation is the process of translating texts from one natural language to another by computer, based on linguistic motivations, statistical approaches, or a combination of the two. In this study, a frequency-driven late fusion-based word decomposition approach is introduced to improve the translation quality of a phrase-based statistical machine translation system from Turkish to English. This late fusion-based approach is compared with the standalone statistical and rule-based word decompositio...
The Second language processing of nominal compounds: a masked priming study
Çelikkol Berk, Nurten; Kırkıcı, Bilal; Department of English Language Teaching (2018)
The primary purpose of the present study was to understand the workings of the cognitive mechanisms underlying L2 morphological processing, and more particularly, to explore how noun-noun compounds in L2 English are processed by native speakers of Turkish in the earliest stages of word recognition. Furthermore, the study investigated the role of constituent morphemes in the processing of compound words and examined whether or not a compound word primes its first and second constituents equally. The final pu...
Morphological processing of inflected and derived words in L1 Turkish and L2 English
Şafak, Duygu Fatma; Kırkıcı, Bilal; Department of English Language Teaching (2015)
The present study aims at examining how inflected and derived words are processed during the early stages of visual word recognition in a native language (L1) and in a second language (L2). A second aim of the study is to find out whether or not the semantic and surface-form properties of morphologically complex words affect early word recognition processes. Two masked priming experiments were conducted to investigate morphological processing in L1 Turkish and in L2 English. In the first experiment, 40 L1 s...
Second language acquisition of the English article system by Turkish learners: the role of semantic notions
Atay, Zeynep; Zeyrek Bozşahin, Deniz; Department of English Language Teaching (2010)
This thesis investigates the second language acquisition of the English article system by Turkish learners in order to find out the role of certain semantic universals of Universal Grammar during the acquisition process. More specifically, the purpose is to see whether or not L1 Turkish learners of English fluctuate between two semantic notions, namely specificity and definiteness, and what effect this fluctuation has on acquisition. 120 students from three groups of learners at different proficiency leve...
Morphological processing in developing readers: a psycholinguistic study on Turkish primary school children
Uğuz, Enis; Kırkıcı, Bilal; Department of English Literature (2018)
The processing of morphologically complex words has been studied in many languages, leading to a variety of theoretical accounts. While dual-route models advocate two distinct mechanisms for word processing, single-route models suggest a single mechanism. Contrasting findings, as well as different interpretations of the same results, have kept the advocates of both accounts searching for a solid and indisputable justification for their views. This thesis investigated the early stages of morphological pro...
Citation Formats
H. İmren, “Head finalization and morphological analysis in factored phrase-based statistical machine translation from English to Turkish,” M.S. - Master of Science, Middle East Technical University, 2015.