Wireless speech recognition using fixed point mixed excitation linear prediction (MELP) vocoder

Date

2002-07-19

Author

Acar, D
Karci, MH
Ilk, HG
Demirekler, Mübeccel

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

A bit-stream-based front end for a wireless speech recognition system operating on the fixed-point mixed excitation linear prediction (MELP) vocoder is presented in this paper. Speaker-dependent, isolated-word recognition accuracies are obtained from conventional and bit-stream-based front-end systems, and their statistical significance is discussed. Feature parameters are extracted from the original speech (wireline), from the decoded speech (conventional), and from the quantized spectral information of the MELP vocoder (bit stream). The recognition accuracies show that the bit-stream-based front end performs comparably to a recognizer using the original input speech and is superior to a recognizer using low-bit-rate decoded speech.
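
The abstract does not say which significance test the authors used. As a rough illustration only, one common way to check whether two recognition accuracies differ significantly is a pooled two-proportion z-test. The sketch below assumes each front end is scored on its own count of test words; the function name and the word counts in the usage note are hypothetical and are not taken from the paper.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion z-test for two recognition accuracies.

    correct_a / n_a and correct_b / n_b are the word-level accuracies of
    two front ends (hypothetical inputs, not values from the paper).
    Returns (z, two_sided_p_value).
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled accuracy under the null hypothesis that both systems are equal.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        # Both systems are all-correct or all-wrong: no measurable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

For example, calling `two_proportion_z_test(950, 1000, 900, 1000)` with these made-up counts compares a 95% and a 90% accurate system. When both systems are scored on exactly the same utterances, a paired test such as McNemar's test is usually the more appropriate choice.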

Subject Keywords

Wireless speech recognition, MELP, Speech coding, Fixed-point arithmetic, Speech recognition accuracy

URI

https://hdl.handle.net/11511/55562

Collections

Graduate School of Natural and Applied Sciences, Conference / Seminar

Citation Formats

D. Acar, M. Karci, H. Ilk, and M. Demirekler, "Wireless speech recognition using fixed point mixed excitation linear prediction (MELP) vocoder," 2002. [Online]. Available: https://hdl.handle.net/11511/55562.