Spherical harmonics based acoustic scene analysis for object-based audio

Çöteli, Mert Burkay
Object-based audio relies on elemental audio signals from individual sound sources and their associated metadata, which are used to reconstruct the scene at the listener side. While defining audio objects in a production setting is straightforward, it is not trivial to extract audio objects from more realistic recording scenarios such as concerts. Thus, existing object-based audio standards also define scene-based formats alongside object-based representations; these provide immersive audio, but without the flexibility provided by object-based audio. Presently, there is no reliable approach to transcode from a scene-based format to an object-based format. This thesis aims to develop acoustic scene analysis techniques to extract the directions of arrival (DOA) of active sources and separate them from scene-based audio representations. Two DOA estimation methods and three source separation methods that use signals from rigid spherical microphone arrays are proposed for this purpose. The proposed methods allow analyzing scenes comprising multiple coherent or nearly coherent sources in highly reverberant and non-reverberant environments. We describe the algorithms, assess their performance objectively and subjectively, and analyze their computational requirements.


Coteli, Mert Burkay; Hacıhabiboğlu, Hüseyin (2018-09-20)
Acoustic source separation refers to the extraction of individual source signals from microphone array recordings of multiple sources made in multipath environments such as rooms. The most straightforward approach to acoustic source separation involves spatial filtering via beamforming. While beamforming works well for a few sources and under low reverberation, its performance diminishes for a high number of sources and/or high reverberation. An informed acoustic source separation method based on the applic...
Thin-Film PZT based Multi-Channel Acoustic MEMS Transducer for Cochlear Implant Applications
Yüksel, Muhammed Berat; Külah, Haluk (2021-01-01)
This paper presents a multi-channel acoustic transducer that works within the audible frequency range (250-5500 Hz) and mimics the operation of the cochlea by filtering incoming sound. The transducer is composed of eight thin-film piezoelectric cantilever beams with different resonance frequencies. With an active volume of 5 mm × 5 mm × 0.62 mm and a mass of 4.8 mg, the transducer is well suited to be implanted in the middle ear cavity. Resonance frequencies and piezoelectric outputs of the bea...
Multiple Sound Source Localization With Steered Response Power Density and Hierarchical Grid Refinement
Coteli, Mert Burkay; Olgun, Orhun; Hacıhabiboğlu, Hüseyin (2018-11-01)
Estimation of the direction-of-arrival (DOA) of sound sources is an important step in sound field analysis. Rigid spherical microphone arrays allow the calculation of a compact spherical harmonic representation of the sound field. The standard method for analyzing sound fields recorded using such arrays uses steered response power (SRP) maps, wherein the source DOA can be estimated as the steering direction that maximizes the output power of a maximally directive beam. This approach is computationally costly s...
Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings
Günel Kılıç, Banu; Hacıhabiboğlu, Hüseyin (2008-01-01)
Microphone array signal processing techniques are extensively used for sound source localisation, acoustical characterisation and sound source separation, which are related to audio analysis. However, the use of microphone arrays for auralisation, which is generally related to synthesis, has been limited so far. This paper proposes a method for binaural auralisation of multiple sound sources based on blind source separation (BSS) and binaural audio synthesis. A BSS algorithm is introduced that exploits the ...
Data-driven Threshold Selection for Direct Path Dominance Test
Olgun, Orhun; Hacıhabiboğlu, Hüseyin (2019-09-09)
Direction-of-arrival estimation methods, when used with recordings made in enclosures, are negatively affected by the reflections and reverberation in that enclosure. The direct path dominance (DPD) test was proposed as a pre-processing stage that can provide better DOA estimates by selecting only the time-frequency bins with a single dominant sound source component prior to DOA estimation, thereby reducing the total computational cost. The DPD test involves selecting bins for which the ratio of the two largest sin...
Citation Formats
M. B. Çöteli, “Spherical harmonics based acoustic scene analysis for object-based audio,” Ph.D. - Doctoral Program, Middle East Technical University, 2021.