Instrument based wavelet packet decomposition for audio feature extraction

Feature extraction from audio data is a major concern in computer-assisted music applications and content-based audio retrieval. For general non-stationary signals, wavelet packet decomposition is used with entropy functions for best-basis search. Musical instruments have well-defined frequency ranges. Thus, when the audio data contains a solo instrument, wavelet packet decomposition can be adapted to that instrument's individual characteristics. The method discussed in this paper uses the averages of a number of wavelet packet coefficients as feature vectors in a multi-layer perceptron (MLP) network to classify the type of the instrument and to form an instrument-specific wavelet packet tree for the extraction of further features (e.g. pitch, duration, brightness). The results presented for flute give accurate estimates of pitch and duration.
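As a rough illustration of the idea of subband averages as features, the sketch below builds a full wavelet packet tree and returns one average per leaf subband. It is not the paper's implementation: the Haar filter pair, the fixed tree depth, the use of mean absolute value as the "average", and the function names (`haar_step`, `wavelet_packet_leaves`, `subband_features`) are all assumptions made for the example. The entropy-based best-basis search and the MLP classifier are not shown.

```python
import math


def haar_step(x):
    """Split x (even length) into low-pass and high-pass halves.

    Uses the orthonormal Haar filter pair (an assumption; the abstract
    does not name the wavelet).
    """
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet_leaves(x, depth):
    """Return the 2**depth leaf subbands of a full wavelet packet tree.

    Unlike the plain wavelet transform, both the low-pass and the
    high-pass output are split again at every level. Leaves are in
    filter-bank (natural) order, not sorted by frequency.
    """
    if depth < 0 or len(x) % (2 ** depth) != 0 or len(x) == 0:
        raise ValueError("signal length must be a non-zero multiple of 2**depth")
    nodes = [list(x)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_step(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


def subband_features(x, depth):
    """One feature per leaf: the mean absolute coefficient (assumed average)."""
    leaves = wavelet_packet_leaves(x, depth)
    return [sum(abs(c) for c in band) / len(band) for band in leaves]


# Hypothetical input: a 440 Hz tone sampled at 8 kHz, 1024 samples.
tone = [math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(1024)]
features = subband_features(tone, 3)
print(len(features), "subband averages")
```

A vector such as `features` could then be passed to any classifier. Because the Haar step is orthonormal, the total energy of the leaves equals the energy of the input, which makes a convenient sanity check.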
Musical instrument recognition with wavelet envelopes
Hacıhabiboğlu, Hüseyin (2002-09-16)
Automatic recognition of instrument type from raw audio data containing monophonic music is a fundamental problem for audio content analysis. There are many methods for the solution of this problem, which use common spectro-temporal properties like cepstral coefficients or spectral envelopes. A new method for instrument recognition utilising short-time amplitude envelopes of wavelet coefficients as feature vectors is presented. The classification engine is a distinctively small multilayer perceptron (MLP) n...
Spherical harmonics based acoustic scene analysis for object-based audio
Çöteli, Mert Burkay; Hacıhabiboğlu, Hüseyin; Department of Information Systems (2021-02-19)
Object-based audio relies on elemental audio signals from individual sound sources and their associated metadata to be reconstructed at the listener side. While defining audio objects in a production setting is straightforward, it is not trivial to extract audio objects from more realistic recording scenarios such as concerts. Thus, existing object-based audio standards also define scene-based formats alongside object-based representations that provide immersive audio, but without the flexibility provided by...
Sound source localization: Conventional methods and intensity vector direction exploitation
Günel Kılıç, Banu; Hacıhabiboğlu, Hüseyin (IGI Global, 2011-01-01)
Automatic sound source localization has recently gained interest due to its various applications, which range from surveillance to hearing aids, and from teleconferencing to human-computer interaction. Automatic sound source localization may refer to the process of determining only the direction of a sound source, which is known as direction-of-arrival estimation, or also its distance in order to obtain its coordinates. Various methods have previously been proposed for this purpose. Many of these methods use t...
Structural and semantic modeling of audio for content-based querying and browsing
Sert, Mustafa; Baykal, Buyurman; Yazıcı, Adnan (2006-01-01)
A typical content-based audio management system deals with three aspects, namely audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate the three aspects of content-based audio management into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we utilize two robust feature sets, namely MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as th...
Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings
Günel Kılıç, Banu; Hacıhabiboğlu, Hüseyin (2008-01-01)
Microphone array signal processing techniques are extensively used for sound source localisation, acoustical characterisation and sound source separation, which are related to audio analysis. However, the use of microphone arrays for auralisation, which is generally related to synthesis, has been limited so far. This paper proposes a method for binaural auralisation of multiple sound sources based on blind source separation (BSS) and binaural audio synthesis. A BSS algorithm is introduced that exploits the ...
