Two stage blind dereverberation based on stochastic models of speech and reverberation

2019
Kavruk, Mehmet
Distant speech processing has become popular owing to the wide use of hands-free communication with smart devices. In distant speech communication, the quality of microphone signals recorded in an enclosed space is degraded by environmental noise and reverberation. Although powerful denoising algorithms exist in the literature, there is no robust dereverberation method that works independently of recording conditions. This work proposes a blind dereverberation algorithm, based on statistical models, that suppresses the reverberant part of the signal without seriously degrading the source signal across different speaker-to-microphone configurations. The proposed algorithm applies minimum variance distortionless response (MVDR) and linear prediction methods in succession. The parameters of the MVDR algorithm are estimated using the statistical nature of reverberation. Linear prediction is then applied to the MVDR output to handle residual reverberation; the dereverberation filter in this stage is generated using statistical models of speech and reverberation. Because both algorithms rely on statistical models, neither requires deterministic prior knowledge about the system. Experimental results show that, according to objective quality measures, the proposed algorithm suppresses reverberation in distant recordings without degrading the source signal under different conditions.
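The two stages named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the thesis estimates its parameters from statistical models of speech and reverberation, whereas this sketch swaps in a random covariance matrix, a hypothetical steering vector `d`, and a plain least-squares fit for the prediction filter. The function names, the filter `order`, and the `delay` are all assumptions chosen for illustration. Stage 1 computes the standard MVDR weights w = R⁻¹d / (dᴴR⁻¹d); stage 2 subtracts the part of a signal that can be predicted from samples at least `delay` steps in the past.

```python
import numpy as np


def mvdr_weights(R, d):
    """Standard MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)


def delayed_lp_residual(x, order, delay):
    """Remove the part of x predictable from samples at least `delay` steps back.

    The filter is fitted by ordinary least squares (an assumption of this
    sketch, not the model-based estimate described in the abstract).
    """
    n = len(x)
    start = delay + order - 1
    # Row for time t holds x[t-delay], x[t-delay-1], ..., x[t-delay-order+1].
    X = np.stack(
        [x[start - delay - k : n - delay - k] for k in range(order)], axis=1
    )
    g, *_ = np.linalg.lstsq(X, x[start:], rcond=None)
    y = x.copy()
    y[start:] = x[start:] - X @ g
    return y


rng = np.random.default_rng(0)

# Stage 1: MVDR on a hypothetical 4-microphone array.
M = 4
d = np.exp(-1j * np.pi * 0.3 * np.arange(M))  # assumed steering vector
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = A @ A.conj().T + M * np.eye(M)  # stand-in Hermitian positive definite covariance
w = mvdr_weights(R, d)
distortionless_gain = w.conj() @ d  # the constraint w^H d = 1

# Stage 2: delayed linear prediction on a toy signal with one late echo.
s = rng.standard_normal(4000)  # toy "source" (white noise, not speech)
x = s.copy()
x[5:] += 0.6 * s[:-5]  # echo arriving 5 samples late
y = delayed_lp_residual(x, order=20, delay=5)

start = 5 + 20 - 1
err_before = np.mean((x[start:] - s[start:]) ** 2)
err_after = np.mean((y[start:] - s[start:]) ** 2)
```

The toy source here is white noise, so anything predictable from the delayed samples is echo. Real speech is correlated over short lags, which is one reason a prediction delay is used: it leaves the short-term structure of the source untouched while targeting later reflections.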

Suggestions

Gestures production under instructional context The role of mode of instruction
Melda, Coşkun; Acartürk, Cengiz (Cognitive Science Society ; 2015-09-25)
We aim at examining how communication mode influences the production of gestures under specific contextual environments. Twenty-four participants were asked to present a topic of their choice under three instructional settings: a blackboard, paper-and-pencil, and a tablet. Participants’ gestures were investigated in three groups: deictic gestures that point to entities, representational gestures that present picturable aspects of semantic content, and beat gestures that are speech-related rhythmic hand move...
Design of a context aware security model for preventing relay attacks using NFC enabled mobile devices
Çavdar, Davut; Betin Can, Aysu; Department of Information Systems (2020)
Near Field Communication (NFC) is a promising communication technology used in smart mobile devices. As an effective and flexible communication technology, NFC is now frequently used in innovative solutions such as payment and access control. Because these transactions are critical, security is an important issue. Several attacks against NFC-enabled applications are mentioned in the literature, yet none of the securit...
Near optimal scheduling for opportunistic spectrum access over block fading channels in cognitive radio assisted vehicular network
Gül, Ömer Melih; Kantarci, Burak (2022-10-01)
© 2022 Elsevier Inc. With the increasing use of cognitive radio technology in vehicular communications, vehicles will be enabled with cognitive radio in the future. Cognitive radio assisted vehicular networks make cognitive radio enabled vehicles utilize licensed spectrum on highways opportunistically. This work tackles cognitive radio assisted vehicular networks including M primary users (transmitter), M primary receivers, a secondary user (transmitter) with K channels and K secondary receivers. A channel i...
SPEECH DETECTION ON BROADCAST AUDIO
Zubari, Unal; Ozan, Ezgi Can; Acar, Banu Oskay; Çiloğlu, Tolga; Esen, Ersin; Ates, Tugrul K.; Onur, Duygu Oskay (2010-08-27)
Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-spe...
Wireless speech recognition using fixed point mixed excitation linear prediction (MELP) vocoder
Acar, D; Karci, MH; Ilk, HG; Demirekler, Mübeccel (2002-07-19)
A bit stream based front-end for a wireless speech recognition system that operates on a fixed point mixed excitation linear prediction (MELP) vocoder is presented in this paper. Speaker dependent, isolated word recognition accuracies are obtained from conventional and bit stream based front-end systems, and their statistical significance is discussed. Feature parameters are extracted from original (wireline) and decoded speech (conventional) and from the quantized spectral information (bit stream) of t...
Citation Formats
M. Kavruk, “Two stage blind dereverberation based on stochastic models of speech and reverberation,” M.S. - Master of Science, Middle East Technical University, 2019.