Turkish large vocabulary continuous speech recognition by using limited audio corpus

2012
Susman, Derya
Speech recognition in Turkish is challenging in several respects, most of them related to the morphological structure of the language. Because Turkish is agglutinative, many words can be generated from a single stem by adding suffixes. This increases the number of out-of-vocabulary (OOV) words, which dramatically degrades the performance of a speech recognizer. Turkish also allows relatively free word order, which makes it difficult to build robust language models. In this thesis, the existing models and approaches that address the problem of Turkish LVCSR (Large Vocabulary Continuous Speech Recognition) are explored. Different recognition units (words, morphs, stems and endings) are used in generating the n-gram language models, and 3-gram and 4-gram language models are generated for each recognition unit. Since speech recognition relies on machine learning, the performance of the recognizer depends on the sufficiency of the audio data used in acoustic model training. However, it is difficult to obtain rich audio corpora for Turkish. In this thesis, existing approaches are used to solve the problem of Turkish LVCSR with a limited audio corpus. We also propose several data selection approaches to improve the robustness of the acoustic model.
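To make the OOV argument concrete, the following minimal Python sketch compares the OOV rate of a toy test set when the recognition unit is the whole word versus a stem plus ending. It is an illustration only, not the thesis's method: the stem list `STEMS`, the function names, and the word lists are hypothetical, and the longest-prefix split is a crude stand-in for a real morphological analyzer.

```python
from typing import Callable, Iterable, List

# Hypothetical stem lexicon (assumption): "ev" = house, "kitap" = book, "okul" = school.
STEMS = ("kitap", "okul", "ev")


def split_stem_ending(word: str) -> List[str]:
    """Split a word into [stem, +ending] by longest matching stem prefix.

    A crude stand-in for a morphological analyzer; words that match no stem,
    or that are a bare stem, are returned as a single unit.
    """
    for stem in sorted(STEMS, key=len, reverse=True):
        if word.startswith(stem) and len(word) > len(stem):
            return [stem, "+" + word[len(stem):]]
    return [word]


def oov_rate(train: Iterable[str], test: Iterable[str],
             segment: Callable[[str], List[str]] = lambda w: [w]) -> float:
    """Return the fraction of test units that never occur among the training units."""
    vocab = {unit for word in train for unit in segment(word)}
    test_units = [unit for word in test for unit in segment(word)]
    if not test_units:
        return 0.0
    missing = sum(1 for unit in test_units if unit not in vocab)
    return missing / len(test_units)


# Toy data (assumption): a handful of inflected forms.
train_words = ["evde", "evler", "kitaplar", "okulda"]
test_words = ["okullar", "kitaplar", "evde", "evlerde"]

word_oov = oov_rate(train_words, test_words)                     # whole words as units
unit_oov = oov_rate(train_words, test_words, split_stem_ending)  # stems and endings as units
print(f"word-level OOV rate: {word_oov:.3f}")
print(f"stem+ending OOV rate: {unit_oov:.3f}")
```

On this toy data the sub-word units lower the OOV rate but do not remove it, because an unseen ending such as `+lerde` is still out of vocabulary.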

Suggestions

On lexicon creation for Turkish LVCSR
Hacıoğlu, Kadri; Pellom, Bryan; Çiloğlu, Tolga; Öztürk, Özlem; Kurimo, Mikko; Creutz, Mathias (2003-09-14)
In this paper, we address the lexicon design problem in Turkish large vocabulary speech recognition. Although we focus only on Turkish, the methods described here are general enough that they can be considered for other agglutinative languages like Finnish, Korean etc. In an agglutinative language, several words can be created from a single root word using a rich collection of morphological rules. So, a virtually infinite size lexicon is required to cover the language if words are used as the basic units. T...
Turkish speech corpora and recognition tools developed by porting SONIC: Towards multilingual speech recognition
Salor, Ozgul; Pellom, Bryan L.; Çiloğlu, Tolga; Demirekler, Mubeccel (2007-10-01)
This paper presents work on developing speech corpora and recognition tools for Turkish by porting SONIC, a speech recognition tool developed initially for English at the Center for Spoken Language Research of the University of Colorado at Boulder. The work presented in this paper had two objectives: The first one is to collect a standard phonetically-balanced Turkish microphone speech corpus for general research use. A 193-speaker triphone-balanced audio corpus and a pronunciation lexicon for Turkish have ...
Language modelling for Turkish as an agglutinative language
Çiloğlu, Tolga; Sahin, S (2004-04-30)
Two types of language models have been considered for Turkish continuous speech recognition. In one case, words are separated into their stems and the remaining parts, and language models are calculated based on this new set of units. In the other case, words are considered as a whole, but language models are calculated with respect to the stems of the words. Studies are carried out for bi-gram and tri-gram formalisms.
Turkish indefinites and accusative marking
Özge, Umut (2011-01-01)
The paper addresses the issue of the effects of overt accusative (Acc) vs. zero (∅) marking on the interpretation of indefinite direct objects in Turkish. Previous attacks on the issue came up with various associations of the overt accusative case with certain semantic and pragmatic categories. A representative list goes as follows: Discourse-linking (Nilsson 1985; Enç 1991; Zidani-Eroğlu 1997), “specificity” (von Heusinger 2002; von Heusinger and Kornfilt 2005), presuppositionality (Kennelly 1997; Kelepir ...
The McGurk Illusion in Turkish
Erdener, Dogu (2015-12-01)
The McGurk effect has been used as an index of auditory-visual speech integration and explored in a variety of languages such as Japanese, English, German and Spanish. However, most languages still remain uncharted territory in auditory-visual speech perception research, and Turkish is one of them. In this preliminary study, the status of the McGurk effect among native Turkish speakers has been examined using native (Turkish) and non-native (English) McGurk stimuli as well as visual-only (VO; lipreadi...
Citation Formats
D. Susman, “Turkish large vocabulary continuous speech recognition by using limited audio corpus,” M.S. - Master of Science, Middle East Technical University, 2012.