Language modelling for Turkish as an agglutinative language
Date
2004-04-30
Author
Çiloğlu, Tolga
Sahin, S
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Two types of language models have been considered for Turkish continuous speech recognition. In the first, words are separated into their stems and their remaining parts (endings), and language models are calculated over this new set of units. In the second, words are treated as whole units, but language models are calculated with respect to the stems of the words. Both approaches are studied with bigram and trigram formalisms.
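The two setups in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the stem/ending segmentation, the `+` marker on endings, and the function names `to_units`, `to_stems`, and `bigram_mle` are all assumptions made for illustration. The second model is given one simple reading (each word replaced by its stem), which the abstract does not spell out, and unsmoothed maximum-likelihood estimation stands in for whatever estimation the paper uses.

```python
from collections import Counter

# Hypothetical pre-segmented corpus: each word is a (stem, ending) pair.
# The segmentation is illustrative only; the abstract does not describe the parser.
corpus = [
    [("ev", "ler"), ("güzel", "")],
    [("ev", "de"), ("kitap", "lar"), ("var", "")],
]

def to_units(sentence):
    """Model 1: split each word into a stem unit and, if present, an ending unit."""
    units = []
    for stem, ending in sentence:
        units.append(stem)
        if ending:
            units.append("+" + ending)  # "+" marks an ending (assumed convention)
    return units

def to_stems(sentence):
    """Model 2 (one possible reading): one token per word, represented by its stem."""
    return [stem for stem, _ in sentence]

def bigram_mle(sequences):
    """Unsmoothed maximum-likelihood bigram estimates P(w2 | w1)."""
    pair_counts = Counter()
    history_counts = Counter()
    for seq in sequences:
        tokens = ["<s>"] + seq + ["</s>"]  # sentence-boundary markers
        for w1, w2 in zip(tokens, tokens[1:]):
            pair_counts[(w1, w2)] += 1
            history_counts[w1] += 1
    return {pair: c / history_counts[pair[0]] for pair, c in pair_counts.items()}

unit_model = bigram_mle([to_units(s) for s in corpus])
stem_model = bigram_mle([to_stems(s) for s in corpus])
```

In the unit-based model the ending becomes a token of its own, so the probability mass after a stem is spread over its endings; in the stem-based model the endings are ignored and the history is the preceding word's stem. A trigram version would extend the history to two preceding tokens in the same way.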
Subject Keywords
Natural languages, Speech recognition, Testing
URI
https://hdl.handle.net/11511/39819
DOI
https://doi.org/10.1109/siu.2004.1338563
Conference Name
IEEE 12th Signal Processing and Communications Applications Conference
Collections
Department of Electrical and Electronics Engineering, Conference / Seminar
Suggestions
Language modeling for Turkish continuous speech recognition
Şahin, Serkan; Çiloğlu, Tolga; Department of Electrical and Electronics Engineering (2003)
This study aims to build a new language model for Turkish continuous speech recognition. Turkish is a very productive language in terms of word forms because of its agglutinative nature. For languages like Turkish, the vocabulary size is far from being acceptable: from only one simple stem, thousands of new words can be generated using inflectional and derivational suffixes. In this work, words are parsed into their stems and endings. First of all, we consider endings as words and we obtained bigram probabi...
Turkish large vocabulary continuous speech recognition by using limited audio corpus
Susman, Derya; Yazıcı, Adnan; Köprü, Selçuk; Department of Computer Engineering (2012)
Speech recognition in Turkish Language is a challenging problem in several perspectives. Most of the challenges are related to the morphological structure of the language. Since Turkish is an agglutinative language, it is possible to generate many words from a single stem by using suffixes. This characteristic of the language increases the out-of-vocabulary (OOV) words, which degrade the performance of a speech recognizer dramatically. Also, Turkish language allows words to be ordered in a free manner, whic...
On lexicon creation for turkish LVCSR
Kadri, Hacıoğlu; Bryan, Pellom; Çiloğlu, Tolga; Öztürk, Özlem; Mikko, Kurimo; Mathias, Creutz (2003-09-14)
In this paper, we address the lexicon design problem in Turkish large vocabulary speech recognition. Although we focus only on Turkish, the methods described here are general enough that they can be considered for other agglutinative languages like Finnish, Korean etc. In an agglutinative language, several words can be created from a single root word using a rich collection of morphological rules. So, a virtually infinite size lexicon is required to cover the language if words are used as the basic units. T...
Head finalization and morphological analysis in factored phrase-based statistical machine translation from English to Turkish
İmren, Haydar; Çakıcı, Ruket; Department of Computer Engineering (2015)
Machine Translation is a field of study which deals with translating text from one natural language to another automatically. Statistical Machine Translation generates the translations using statistical methods and bilingual text corpora. In this study, an approach for translating from English to Turkish is introduced. Turkish is an agglutinative language with a free constituent order, whereas English is not agglutinative and the constituent order is strict. Besides these differences, there is a lack of par...
Frequency-driven late fusion-based word decomposition approach on the phrase-based statistical machine translation systems
Tatlıcıoğlu, Mehmet; Yazıcı, Adnan; Department of Computer Engineering (2013)
Machine translation is the process of translating texts from a natural language to another by computers based on linguistic motivations, statistical approaches, or the combination of them. In this study, the frequency-driven late fusion-based word decomposition approach is introduced to improve the translation quality of the phrase-based statistical machine translation system from Turkish to English. This late fusion-based approach is compared with the standalone statistical and rule-based word decompositio...
Citation Formats
IEEE
T. Çiloğlu and S. Sahin, “Language modelling for Turkish as an agglutinative language,” presented at the IEEE 12th Signal Processing and Communications Applications Conference, Kusadasi, TURKEY, 2004, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/39819.