Structural and semantic modeling of audio for content-based querying and browsing

2006-01-01
A typical content-based audio management system deals with three aspects: audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate these three aspects into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we use two robust feature sets, MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as the underlying features to improve content-based retrieval accuracy, since each has advantages for distinct types of audio (e.g., music and speech). The proposed system provides a wide range of ways to query and browse audio data by content, such as querying and browsing for a chorus section or sound effects, and query-by-example. In addition, clients can express their queries as point, range, and k-nearest neighbor queries, which are particularly significant in the multimedia domain.
FLEXIBLE QUERY ANSWERING SYSTEMS, PROCEEDINGS
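The three query forms named in the abstract (point, range, and k-nearest neighbor) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each audio clip has already been reduced to a fixed-length feature vector (for example, averaged MFCC or ASF values), that similarity is measured by Euclidean distance, and that a range query means "all clips within a given distance of the query". The clip names, the toy two-dimensional vectors, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def point_query(db: Dict[str, Vector], q: Vector, tol: float = 1e-9) -> List[str]:
    """Clips whose feature vector matches q (within a small tolerance)."""
    return sorted(name for name, v in db.items() if euclidean(v, q) <= tol)


def range_query(db: Dict[str, Vector], q: Vector, radius: float) -> List[str]:
    """Clips whose distance to q is at most `radius` (assumed meaning of 'range')."""
    return sorted(name for name, v in db.items() if euclidean(v, q) <= radius)


def knn_query(db: Dict[str, Vector], q: Vector, k: int) -> List[str]:
    """The k clips closest to q, nearest first; ties are broken by clip name."""
    ranked = sorted(db.items(), key=lambda item: (euclidean(item[1], q), item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical toy database: clip name -> 2-D feature vector.
DB: Dict[str, Vector] = {
    "clip_a": (0.0, 0.0),
    "clip_b": (1.0, 0.0),
    "clip_c": (0.0, 2.0),
    "clip_d": (3.0, 4.0),
}

if __name__ == "__main__":
    print(point_query(DB, (1.0, 0.0)))
    print(range_query(DB, (0.0, 0.0), radius=2.0))
    print(knn_query(DB, (0.0, 0.0), k=2))
```

A linear scan like this is only meant to make the query semantics concrete; a real system would typically place the vectors in an index structure so that range and k-nearest neighbor queries do not have to visit every clip.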

Suggestions

Content-based audio management and retrieval system for news broadcasts
Doğan, Ebru; Yazıcı, Adnan; Department of Computer Engineering (2009)
Audio signals can provide rich semantic cues for analyzing multimedia content, so audio information has recently been used for content-based multimedia indexing and retrieval. Due to the growing amount of audio data, the demand for efficient retrieval techniques is increasing. In this thesis work, we propose a complete, scalable and extensible audio-based content management and retrieval system for news broadcasts. The proposed system considers classification, segmentation, analysis and retrieval of an audio st...
Improving performance of a remote robotic teleoperation over the internet
Arslan, Mehmet Selçuk; Konukseven, Erhan İlhan; Department of Mechanical Engineering (2005)
This thesis study aims to improve the performance of an Internet-based teleoperation system enabling the remote operation of a 6 DOF industrial robot. In order to improve the safety and efficiency of the teleoperation, stability and synchronization (hand-eye coordination) are considered. The selected communication medium between the human operator and remote robot is the Internet. The variable time delays and nondeterministic characteristics of the Internet may lead to instability of the teleoper...
Generating expressive summaries for speech and musical audio using self-similarity clues
Sert, Mustafa; Baykal, Buyurman; Yazıcı, Adnan (2006-07-12)
We present a novel algorithm for structural analysis of audio to detect repetitive patterns that are suitable for content-based audio information retrieval systems, since repetitive patterns can provide valuable information about the content of audio, such as a chorus or a concept. The Audio Spectrum Flatness (ASF) feature of the MPEG-7 standard, although it has received less attention than other feature types, has been utilized and evaluated as the underlying feature set. Expressive summaries are chosen a...
Dynamic performances of kinematically and dynamically adjustable planar mechanisms
İyiay, Erdinç; Soylu, Reşit; Department of Mechanical Engineering (2003)
In this thesis, the dynamic performances of kinematically and dynamically adjustable planar mechanisms have been investigated. An adjustable mechanism is here defined to be a mechanism where some of the kinematic and/or dynamic parameters are changed in a controlled manner in order to optimize the dynamic behaviour of the mechanism in spite of variable operating conditions. Here, variable operating conditions refer to variable load(s) on the mechanism and/or variable desired input motion. The dynamic behavi...
Modular embedded system design / implementation for mechatronic education and research
Nursal, Ali Özgü; Koku, Ahmet Buğra; Department of Mechanical Engineering (2007)
In this thesis, a modular embedded system for Mechatronics education and research is designed and implemented. Four types of control boards are manufactured and related software is developed at board and PC level. A star-like topology is used for the board architecture. One bridge board is responsible for handling communication between the PC and all the other boards that are connected independently to that bridge board. For PC communication Universal Serial Bus (USB), for inter peripheral communication serial p...
Citation Formats
M. Sert, B. Baykal, and A. Yazıcı, “Structural and semantic modeling of audio for content-based querying and browsing,” FLEXIBLE QUERY ANSWERING SYSTEMS, PROCEEDINGS, pp. 319–330, 2006, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/54087.