Spherical harmonics based acoustic scene analysis for object-based audio

Çöteli, Mert Burkay
Object-based audio relies on elemental audio signals from individual sound sources, together with their associated metadata, to reconstruct the audio scene at the listener side. While defining audio objects in a production setting is straightforward, it is not trivial to extract audio objects from more realistic recording scenarios such as concerts. Thus, existing object-based audio standards also define scene-based formats alongside object-based representations; these scene-based formats provide immersive audio, but without the flexibility of object-based audio. Presently, there is no reliable approach to transcode from a scene-based format to an object-based format. This thesis aims to develop acoustic scene analysis techniques that estimate the directions of arrival (DOA) of active sources in scene-based audio representations and separate those sources. Two DOA estimation methods and three source separation methods that use signals from rigid spherical microphone arrays are proposed for this purpose. The proposed methods allow analyzing scenes comprising multiple coherent or nearly coherent sources in highly reverberant and non-reverberant environments. We describe the algorithms, assess their performance objectively and subjectively, and analyze their computational requirements.
Citation Formats
M. B. Çöteli, “Spherical harmonics based acoustic scene analysis for object-based audio,” Ph.D. - Doctoral Program, 2021.