Segmental duration modelling in Turkish

Date

2006-01-01

Author

Ozturk, Ozlem
Çiloğlu, Tolga

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The naturalness of synthetic speech depends heavily on appropriate modelling of prosody. Three prosodic components are usually modelled: segmental duration, pitch contour and intensity. In this study, we present our work on modelling segmental duration in Turkish using machine-learning algorithms, in particular Classification and Regression Trees (CART). The models predict phone durations from attributes extracted from a speech corpus, such as the identities of the current, preceding and following phones, stress, part-of-speech, word length in syllables, and the position of the word in the utterance. The resulting models predict segment durations better than mean-duration approximations, reaching a correlation coefficient of approximately 0.77 and a root-mean-squared error of 20.4 ms. To improve prediction performance further, the attributes used to build the segmental duration models are optimized with the Sequential Forward Selection method. This selection yields the following optimum attribute set for phoneme duration modelling: phone identity, neighbouring phone identities, lexical stress, syllable type, part-of-speech, phrase break information, and the location of the word in the phrase.
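The evaluation and attribute-selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a per-combination mean-duration lookup table stands in for a regression tree, the data are synthetic, and the attribute names (`phone`, `stressed`, `coin`) and all function names are hypothetical. The sketch shows how root-mean-squared error and the correlation coefficient are computed, and how Sequential Forward Selection greedily adds the attribute that most reduces held-out error, starting from the mean-duration baseline.

```python
import math
from collections import defaultdict

def rmse(pred, obs):
    """Root-mean-squared error between predicted and observed durations (ms)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson(pred, obs):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)

def fit_group_means(rows, attrs):
    """Assumed stand-in for a regression tree: mean duration per attribute
    combination, backing off to the global mean for unseen combinations."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in rows:
        key = tuple(r[a] for a in attrs)
        sums[key][0] += r["dur_ms"]
        sums[key][1] += 1
    global_mean = sum(r["dur_ms"] for r in rows) / len(rows)
    table = {k: total / count for k, (total, count) in sums.items()}
    return lambda r: table.get(tuple(r[a] for a in attrs), global_mean)

def evaluate(train, test, attrs):
    """Held-out RMSE of the lookup model; attrs == [] is the mean baseline."""
    predict = fit_group_means(train, attrs)
    return rmse([predict(r) for r in test], [r["dur_ms"] for r in test])

def sequential_forward_selection(train, test, candidates):
    """Greedily add the attribute that lowers held-out RMSE the most;
    stop as soon as no remaining attribute improves the score."""
    selected = []
    best = evaluate(train, test, selected)
    remaining = list(candidates)
    while remaining:
        score, attr = min((evaluate(train, test, selected + [a]), a)
                          for a in remaining)
        if score >= best:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best

def make_rows(n, offset):
    """Synthetic toy data (not from the paper): duration depends on phone
    identity and stress; 'coin' is an attribute unrelated to that structure."""
    base = {"a": 90.0, "k": 60.0, "s": 110.0}
    rows = []
    for i in range(n):
        phone = "aks"[i % 3]
        stressed = (i // 3) % 2 == 1
        coin = (i + offset) % 5 == 0
        jitter = ((i * 7 + offset) % 5) - 2  # deterministic noise in [-2, 2] ms
        dur = base[phone] + (20.0 if stressed else 0.0) + jitter
        rows.append({"phone": phone, "stressed": stressed,
                     "coin": coin, "dur_ms": dur})
    return rows

if __name__ == "__main__":
    train_rows, test_rows = make_rows(60, 0), make_rows(30, 1)
    chosen, err = sequential_forward_selection(
        train_rows, test_rows, ["phone", "stressed", "coin"])
    print("mean-duration baseline RMSE:", evaluate(train_rows, test_rows, []))
    print("selected attributes:", chosen, "RMSE:", err)
```

Scoring on held-out rows rather than training rows is a deliberate choice in this sketch: a lookup table keyed on more attributes always fits its training data at least as well, so only held-out error can indicate that an attribute does not help.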

Subject Keywords

Mean absolute error, Pitch contour, Synthetic speech, Speech corpus, Speech database

URI

https://hdl.handle.net/11511/54581

Journal

TEXT, SPEECH AND DIALOGUE, PROCEEDINGS

Collections

Department of Electrical and Electronics Engineering, Article

Citation Formats

O. Ozturk and T. Çiloğlu, “Segmental duration modelling in Turkish,” TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, pp. 669–676, 2006. [Online]. Available: https://hdl.handle.net/11511/54581.