Vocal tract resonances tracking based on voiced and unvoiced speech classification using dynamic programming and fixed interval Kalman smoother

Date

2008-04-04

Author

Oezbek, I. Yuecel
Demirekler, Mübeccel

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This paper presents a systematic framework for accurate estimation of vocal tract resonances (formants) using neither training data nor a phonetic transcription. In the proposed method, the speech signal is segmented into voiced and unvoiced parts; within each part, the resonance frequencies of the vocal tract are estimated by dynamic programming and then further processed by Kalman filtering/smoothing. The performance of the proposed method is compared with that of three other methods: a baseline, WaveSurfer [10], and MSR [5]. The proposed method reduces the overall estimation error rate for the vocal tract resonances (F1, F2 and F3) by 35%, 39.6% and 2.74% compared with the baseline, WaveSurfer and MSR methods, respectively.
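
The abstract does not give the state-space model, so the following is only a minimal sketch of the general idea of fixed-interval Kalman smoothing applied separately to each voiced or unvoiced part. It assumes a scalar random-walk model for a single resonance track, uses the Rauch-Tung-Striebel form of the smoother, and takes the per-frame voicing labels as given. The function names `rts_smooth` and `smooth_by_segment` and the noise variances `q` and `r` are hypothetical choices for illustration, not values from the paper.

```python
from itertools import groupby


def rts_smooth(observations, q=100.0 ** 2, r=200.0 ** 2):
    """Fixed-interval (Rauch-Tung-Striebel) smoother for one frequency track.

    Assumed model (an illustration, not taken from the paper):
        x[t] = x[t-1] + w,   w ~ N(0, q)   (random-walk resonance frequency)
        y[t] = x[t]   + v,   v ~ N(0, r)   (noisy per-frame estimate)
    q and r are variances in Hz^2.
    """
    n = len(observations)
    if n == 0:
        return []

    x_filt = [0.0] * n  # filtered means      x[t|t]
    p_filt = [0.0] * n  # filtered variances  P[t|t]
    p_pred = [0.0] * n  # predicted variances P[t|t-1]

    # Forward pass: ordinary Kalman filter, initialised at the first frame.
    x_filt[0], p_filt[0] = float(observations[0]), r
    for t in range(1, n):
        p_pred[t] = p_filt[t - 1] + q          # predict (mean is unchanged)
        gain = p_pred[t] / (p_pred[t] + r)     # Kalman gain
        x_filt[t] = x_filt[t - 1] + gain * (observations[t] - x_filt[t - 1])
        p_filt[t] = (1.0 - gain) * p_pred[t]

    # Backward pass: refine each frame using all later frames in the interval.
    x_smooth = x_filt[:]
    for t in range(n - 2, -1, -1):
        back_gain = p_filt[t] / p_pred[t + 1]
        # For a random walk the one-step prediction x[t+1|t] equals x[t|t].
        x_smooth[t] = x_filt[t] + back_gain * (x_smooth[t + 1] - x_filt[t])
    return x_smooth


def smooth_by_segment(track, voiced, q=100.0 ** 2, r=200.0 ** 2):
    """Smooth each run of identically labelled frames independently."""
    out, start = [], 0
    for _, run in groupby(voiced):
        length = len(list(run))
        out.extend(rts_smooth(track[start:start + length], q, r))
        start += length
    return out


if __name__ == "__main__":
    f1 = [500.0, 510.0, 900.0, 505.0, 495.0, 1500.0, 1480.0, 1510.0]
    is_voiced = [True] * 5 + [False] * 3
    print(smooth_by_segment(f1, is_voiced))
```

Restarting the smoother at every voicing boundary keeps a jump between adjacent parts from being blurred across the boundary, which is one plausible motivation for processing each part on its own.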

Subject Keywords

Formant tracking, Voiced and unvoiced speech classification, Kalman filtering/smoothing, VTR, Vocal tract resonances

URI

https://hdl.handle.net/11511/56714

DOI

https://doi.org/10.1109/icassp.2008.4518585

Collections

Graduate School of Natural and Applied Sciences, Conference / Seminar

Suggestions

Tracking of Visible Vocal Tract Resonances (VVTR) Based on Kalman Filtering. Özbek Arslan, Işıl; Demirekler, Mübeccel (2006-01-01). This paper analyzes vocal tract resonance (VTR) frequency trajectories and their relationship to formants from a new point of view. Considering abrupt/continuous changes in the physical geometry of vocal tract, VTR may change in number, suddenly change their positions or may leak to some regions where they usually do not exist. We define the visible VTR (VVTR) as VTR that can be seen from the spectrogram. So we propose an algorithm, based on Kalman filtering, that can handle all these changes in VVTR. The s...
Speaker identification by combining multiple classifiers using Dempster-Shafer theory of evidence. Altincay, H; Demirekler, Mübeccel (2003-11-01). This paper presents a multiple classifier approach as an alternative solution to the closed-set text-independent speaker identification problem. The proposed algorithm which is based on Dempster-Shafer theory of evidence computes the first and Rth level ranking statistics. Rth level confusion matrices extracted from these ranking statistics are used to cluster the speakers into model sets where they share set specific properties. Some of these model sets are used to reflect the strengths and weaknesses of t...
Wavelet packet based analysis of sound fields in rooms using coincident microphone arrays. Günel Kılıç, Banu; Hacıhabiboğlu, Hüseyin (2007-07-01). This paper presents a passive analysis method for determining the spatio-temporal characteristics of sound fields in small rooms. The analysis finds an approximate directional reflectogram (ADR) which reveals the approximate arrival directions, time delays and amplitudes of the direct sound and early reflections without using a special or known sound source. A coincident microphone array is used to obtain directional recordings. The recordings are analysed by wavelet packet decomposition to determine the di...
Analysis and Design of Multichannel Systems for Perceptual Sound Field Reconstruction. De Sena, Enzo; Hacıhabiboğlu, Hüseyin; Cvetkovic, Zoran (2013-08-01). This paper presents a systematic framework for the analysis and design of circular multichannel surround sound systems. Objective analysis based on the concept of active intensity fields shows that for stable rendition of monochromatic plane waves it is beneficial to render each such wave by no more than two channels. Based on that finding, we propose a methodology for the design of circular microphone arrays, in the same configuration as the corresponding loudspeaker system, which aims to capture inter-cha...
An Industrially useful means for decomposing and differentiation of harmonics components of periodic waveforms. Dölen, Melik (Institute of Electrical and Electronics Engineers (IEEE); 2000-10-12). This paper presents efficient methods to estimate the spectral content of (noisy) periodic waveforms that are common in industrial processes. The techniques presented, which are based on the recursive discrete Fourier transform, are especially useful in computing high-order derivatives of such waveforms. Unlike conventional differentiating techniques, the methods presented differentiate in the frequency domain and thus are quite immune to uncorrelated measurement noise. This paper also shows the theoretical...

Citation Formats

I. Y. Oezbek and M. Demirekler, “Vocal tract resonances tracking based on voiced and unvoiced speech classification using dynamic programming and fixed interval Kalman smoother,” 2008. [Online]. Available: https://hdl.handle.net/11511/56714.