Modeling phoneme durations and fundamental frequency contours in Turkish speech

2005
Öztürk, Özlem
The term prosody refers to characteristics of speech such as intonation, timing, loudness, and other acoustical properties imposed by the physical, intentional, and emotional state of the speaker. Phone durations and fundamental frequency contours are considered two of the most prominent aspects of prosody. This thesis studies the modeling of phone durations and fundamental frequency contours in Turkish speech. Various methods exist for building prosody models, and the state of the art is dominated by corpus-based methods. This study introduces corpus-based approaches that use classification and regression trees to discover the relationships between prosodic attributes and phone durations or fundamental frequency contours. In this context, a speech corpus designed to have specific phonetic and prosodic content has been recorded and annotated. A set of prosodic attributes is compiled, with its elements determined from linguistic studies and literature surveys. The relevance of the prosodic attributes is investigated with statistical measures such as mutual information and information gain. Fundamental frequency contour modeling and phone duration modeling are handled as independent problems. Phone durations are predicted using regression trees, where the set of prosodic attributes is formed by forward selection. Quantization of phone durations is studied to improve prediction quality. A two-stage duration prediction process is proposed for handling specific ranges of phone duration values. Scaling and shifting of predicted durations are proposed to minimize mean squared error. Fundamental frequency contour modeling is studied under two different frameworks. One of them generates a codebook of syllable fundamental frequency contours by vector quantization. The codewords are used to predict sentence fundamental frequency contours. Pitch accent prediction by two different
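The abstract mentions scaling and shifting predicted durations to minimize mean squared error but does not say how the two parameters are obtained. As a minimal sketch, assuming a single global scale and shift fitted by ordinary least squares (an assumption, not necessarily the procedure used in the thesis), the closed-form solution could look like the following. The function names and the sample durations in milliseconds are hypothetical.

```python
def fit_scale_shift(predicted, observed):
    """Return (scale, shift) minimizing the mean squared error of
    scale * predicted + shift against observed (ordinary least squares).

    Assumption: one global affine correction for all phones.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_p == 0:
        raise ValueError("predicted durations are constant; scale is undefined")
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    scale = cov / var_p
    shift = mean_o - scale * mean_p
    return scale, shift


def apply_scale_shift(predicted, scale, shift):
    """Apply the affine correction to each predicted duration."""
    return [scale * p + shift for p in predicted]


def mean_squared_error(predicted, observed):
    """Mean squared error between two equally long sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical durations in milliseconds, for illustration only.
    raw = [60.0, 90.0, 120.0]
    target = [70.0, 85.0, 130.0]
    scale, shift = fit_scale_shift(raw, target)
    adjusted = apply_scale_shift(raw, scale, shift)
    print(mean_squared_error(raw, target), mean_squared_error(adjusted, target))
```

Because the identity transform (scale 1, shift 0) is one of the candidates, the fitted correction can never increase the mean squared error on the data it was fitted to; whether it helps on held-out data is a separate question.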

Suggestions

Multi-transducer ultrasonic communication
Ersagun, Erdem; Yılmaz, Ali Özgür; Department of Electrical and Electronics Engineering (2009)
RF and acoustic communications are widely used in terrestrial and underwater environments, respectively. This thesis examines the use of ultrasonic communication as an alternative in terrestrial applications. We first investigate the ultrasonic channel in order to observe whether reliable communication is possible among the ultrasonic nodes as an alternative to RF-based communications. Some key characteristics of the single-input-single-output (SISO) and single-input-multiple-output (SIMO) ultrasonic channel are ...
Identification of electromagnetic scattering mechanisms by two dimensional windowed fourier transform approach
Germeç, K. Egemen; Kuzuoğlu, Mustafa; Department of Electrical and Electronics Engineering (2004)
In this thesis, it is demonstrated that the two-dimensional Windowed Fourier Transform (WFT) can be effectively used to analyze the local spectral characteristics of electromagnetic scattering signals in the two-dimensional spatial frequency domain. The WFT is the extension of the Short Time Fourier Transform (STFT), which was originally derived to analyze the local spectral characteristics of one dimensional time functions. Since the WFT focuses on the local spectral behavior of the scattered field, the si...
Analysis of conventional low voltage power line communication methods for automatic meter reading and the classification and experimental verification of noise types for low voltage power line communication network
Danışman, Batuhan; Sevaioğlu, Osman; Department of Electrical and Electronics Engineering (2009)
In this thesis, the conventional low voltage power line communication methods are investigated in the context of automated meter reading applications, together with the classification and experimental verification of common noise types for low voltage power line communication networks. The investigated system provides the real time transmission of electricity consumption data recorded by electricity meters, initially to a local computer via a low voltage line through a low speed PLC (Power Line Carrier) environment and sub...
Design and realization of broadband instantaneous frequency discriminator
Pamuk, Gökhan; Yıldırım, Nevzat; Department of Electrical and Electronics Engineering (2010)
In this thesis, the RF sections of a multi-tier instantaneous frequency measurement (IFM) receiver that can operate in the 2 – 18 GHz frequency band are designed, simulated and partially realized. The designed structure uses one coarse tier, three medium tiers and one fine tier for frequency discrimination. A novel reflective phase shifting technique is developed which enables the design of very wideband phase shifters using stepped cascaded transmission lines. Compared to the classical phase shifters using coupled ...
Multiview 3d reconstruction of a scene containing independently moving objects
Tola, Engin; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2005)
In this thesis, the structure from motion problem for calibrated scenes containing independently moving objects (IMO) has been studied. For this purpose, the overall reconstruction process is partitioned into various stages. The first stage deals with the fundamental problem of estimating structure and motion by using only two views. This process starts with finding some salient features using a sub-pixel version of the Harris corner detector. The features are matched with the help of a similarity and neighbo...
Citation Formats
Ö. Öztürk, “Modeling phoneme durations and fundamental frequency contours in Turkish speech,” Ph.D. - Doctoral Program, Middle East Technical University, 2005.