Two stage blind dereverberation based on stochastic models of speech and reverberation

Date

2019

Author

Kavruk, Mehmet

Distant speech processing has become popular with the wide use of hands-free communication with smart devices. In distant speech communication, the quality of microphone signals recorded in an enclosed space is degraded by environmental noise and reverberation. Although powerful denoising algorithms exist in the literature, there is no robust dereverberation method that works independently of the recording conditions. This work proposes a blind dereverberation algorithm based on statistical models, which suppresses the reverberant part of the signal without causing serious degradation of the source signal across different speaker-to-microphone configurations. The proposed algorithm applies minimum variance distortionless response (MVDR) and linear prediction methods in succession. The parameters of the MVDR algorithm are estimated using the statistical nature of reverberation. Linear prediction is then applied to the MVDR output to handle residual reverberation; the dereverberation filter in this stage is generated using statistical models of speech and reverberation. Because both stages rely on statistical models, neither requires deterministic prior knowledge about the system. The experimental results demonstrate that, according to objective quality measures, the proposed algorithm suppresses reverberation in distant recordings without degrading the source signal under different conditions.
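
As a rough illustration of the "distortionless" property that the first stage relies on, the sketch below computes textbook MVDR weights for a single frequency bin, w = R^{-1} d / (d^H R^{-1} d). It is not the estimator proposed in the thesis: the steering vector, the covariance matrix, the microphone count, and the function name `mvdr_weights` are all assumptions made for the example, whereas the thesis estimates the MVDR parameters from the statistical nature of reverberation.

```python
import numpy as np


def mvdr_weights(undesired_cov, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin.

    undesired_cov: (M, M) Hermitian covariance of the undesired component
                   (assumed given here; the thesis estimates it statistically).
    steering:      (M,) steering vector of the desired direct path (assumed known).
    """
    r_inv_d = np.linalg.solve(undesired_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)


# Hypothetical 4-microphone setup at a single frequency bin.
rng = np.random.default_rng(0)
num_mics = 4
steering = np.exp(-1j * np.pi * 0.3 * np.arange(num_mics))

# Synthetic covariance built from random snapshots, plus diagonal loading
# so that the matrix is well conditioned.
snapshots = rng.standard_normal((num_mics, 64)) + 1j * rng.standard_normal((num_mics, 64))
undesired_cov = snapshots @ snapshots.conj().T / 64 + 1e-3 * np.eye(num_mics)

w = mvdr_weights(undesired_cov, steering)

# Distortionless constraint: the desired direction passes with unit gain.
distortionless_gain = w.conj() @ steering

# Residual power of the undesired component at the beamformer output.
output_power = np.real(w.conj() @ undesired_cov @ w)

# Delay-and-sum weights meet the same constraint and serve as a baseline.
w_das = steering / num_mics
das_power = np.real(w_das.conj() @ undesired_cov @ w_das)
```

Among all weight vectors that pass the desired direction with unit gain, the MVDR solution minimizes the output power of the undesired component, so `output_power` should not exceed `das_power` in this setup.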

Subject Keywords

Speech perception, Reverberation time, Stochastic processes, Mathematical statistics

URI

http://etd.lib.metu.edu.tr/upload/12623026/index.pdf
https://hdl.handle.net/11511/27964

Collections

Graduate School of Natural and Applied Sciences, Thesis

Citation Formats

M. Kavruk, “Two stage blind dereverberation based on stochastic models of speech and reverberation,” M.S. - Master of Science, Middle East Technical University, 2019.