Content-based audio management and retrieval system for news broadcasts
Date
2009
Author
Doğan, Ebru
Audio signals can provide rich semantic cues for analyzing multimedia content, so audio information has recently been used for content-based multimedia indexing and retrieval. Due to the growing amount of audio data, demand for efficient retrieval techniques is increasing. In this thesis, we propose a complete, scalable, and extensible audio-based content management and retrieval system for news broadcasts. The proposed system covers classification, segmentation, analysis, and retrieval of an audio stream. In the sound classification and segmentation stage, a sound stream is segmented by classifying each sub-segment, in multiple steps, as silence, pure speech, music, environmental sound, speech over music, or speech over environmental sound. Support Vector Machines and Hidden Markov Models are employed for classification, and these models are trained using different sets of MPEG-7 features. In the analysis and retrieval stage, users have two alternatives for querying audio data. The first isolates the user from the main acoustic classes by providing semantic-domain-based fuzzy classes. The second lets users query audio by supplying an audio sample in order to find similar segments, or by directly requesting an expressive summary of the content. Additionally, a series of tests was conducted on audio tracks of TRECVID news broadcasts to evaluate the performance of the proposed solution.
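As a rough illustration of the segmentation and query-by-example ideas in the abstract, the following minimal Python sketch merges per-sub-segment class labels into contiguous segments and ranks segments by similarity to a query vector. It is not the thesis implementation: the labels are assumed to come from some already-trained classifier (the SVM and HMM models are not reproduced here), the feature vectors are plain lists standing in for MPEG-7 descriptors, and cosine similarity is an assumed measure chosen only for simplicity. The names `merge_segments`, `query_by_example`, and `hop_seconds` are hypothetical.

```python
from itertools import groupby
from math import sqrt

# The six acoustic classes named in the abstract.
CLASSES = {
    "silence", "pure speech", "music", "environmental sound",
    "speech over music", "speech over environmental sound",
}


def merge_segments(labels, hop_seconds=1.0):
    """Merge runs of identical sub-segment labels into (start, end, label) tuples.

    `labels` is assumed to be the per-sub-segment output of a classifier;
    `hop_seconds` is a hypothetical sub-segment duration.
    """
    segments = []
    index = 0
    for label, run in groupby(labels):
        if label not in CLASSES:
            raise ValueError(f"unknown class: {label!r}")
        length = len(list(run))
        segments.append((index * hop_seconds, (index + length) * hop_seconds, label))
        index += length
    return segments


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query_by_example(query_vector, segment_vectors, top_k=3):
    """Return indices of the top_k segments most similar to the query vector.

    Cosine similarity is an assumption here, not the measure used in the thesis.
    """
    scored = sorted(
        enumerate(segment_vectors),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [i for i, _ in scored[:top_k]]
```

In this sketch, a label sequence such as silence, pure speech, pure speech, music collapses to three segments, and a query vector is matched against one feature vector per segment.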
Subject Keywords
Computer engineering, Content-Based Retrieval
URI
http://etd.lib.metu.edu.tr/upload/12611018/index.pdf
https://hdl.handle.net/11511/18543
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
Structural and semantic modeling of audio for content-based querying and browsing
Sert, Mustafa; Baykal, Buyurman; Yazıcı, Adnan (2006-01-01)
A typical content-based audio management system deals with three aspects namely audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate the three aspects of content-based audio management into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we utilize two robust feature sets namely MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC) as th...
Multimedia Information Retrieval Using Fuzzy Cluster-Based Model Learning
Sattari, Saeid; Yazıcı, Adnan (2017-07-12)
Multimedia data, particularly digital videos, which contain various modalities (visual, audio, and text) are complex and time consuming to model, process, and retrieve. Therefore, efficient methods are required for retrieval of such complex data. In this paper, we propose a multimodal query level fusion approach using a fuzzy cluster-based learning method to improve the retrieval performance of multimedia data. Experimental results on a real dataset demonstrate that employing fuzzy clustering achieves notab...
Spherical harmonics based acoustic scene analysis for object-based audio
Çöteli, Mert Burkay; Hacıhabiboğlu, Hüseyin; Department of Information Systems (2021-2-19)
Object-based audio relies on elemental audio signals from individual sound sources and their associated metadata to be reconstructed at the listener side. While defining audio objects in a production setting is straightforward, it is not trivial to extract audio objects from more realistic recording scenarios such as concerts. Thus, existing object-based audio standards also define scene-based formats alongside object-based representations that provide immersive audio, but without the flexibility provided by...
Matrix quantization and mixed excitation based linear predictive speech coding at very low bit rates
Özaydın, Selma; Baykal, Buyurman (Elsevier BV, 2003-10)
A matrix quantization scheme and a very low bit rate vocoder are developed to obtain good quality speech for low capacity communication links. The new matrix quantization method operates at bit rates between 400 and 800 bps and, using a 25 ms linear predictive coding (LPC) analysis frame, achieves a spectral distortion of about 1 dB at 800 bps. Techniques for improving the performance at very low bit rate vocoding include quantization of residual line spectral frequency (LSF) vectors, multistage matrix quantiza...
Wireless speech recognition using fixed point mixed excitation linear prediction (MELP) vocoder
Acar, D; Karci, MH; Ilk, HG; Demirekler, Mübeccel (2002-07-19)
A bit stream based front-end for wireless speech recognition system that operates on fixed point mixed excitation linear prediction (MELP) vocoder is presented in this paper. Speaker dependent, isolated word recognition accuracies obtained from conventional and bit stream based front-end systems are obtained and their statistical significance is discussed. Feature parameters are extracted from original (wireline) and decoded speech (conventional) and from the quantized spectral information (bit stream) of t...
Citation Formats
IEEE
E. Doğan, “Content-based audio management and retrieval system for news broadcasts,” M.S. - Master of Science, Middle East Technical University, 2009.