Sparse Representations with Legendre Kernels for DOA Estimation and Acoustic Source Separation

Date

2021-01-01

Author

Coteli, Mert Burkay
Hacıhabiboğlu, Hüseyin

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recording multiple sound sources in a reverberant environment results in convolutive mixtures. Sound sources can be extracted from microphone array recordings of such mixtures using acoustic source separation techniques. Acoustic source separation using recordings obtained from rigid spherical microphone arrays (RSMAs) benefits from the representation of sound fields as series of spherical harmonics. More specifically, RSMAs afford increased flexibility in acoustic beamforming and spatial filtering. We propose a data-driven direction-of-arrival (DOA) estimation and acoustic source separation method based on a dictionary-based sparse decomposition of sound fields. The proposed method involves identifying the time-frequency bins with contributions from a single source only and those with sensor noise or diffuse sound field components. The former set of bins is used in DOA estimation and beamforming in the sparse decomposition domain. The latter set is used to calculate the diffuse field covariance matrix used in Wiener post-filtering to improve the source separation performance further. We demonstrate the utility of the proposed method via extensive objective and subjective evaluations.
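The abstract does not give implementation details, so the following is only a rough Python sketch of the general idea of matching a directional response against a dictionary of Legendre kernels. It assumes the standard order-limited kernel from the spherical harmonic addition theorem, sum over n of (2n+1)/(4π) P_n(cos Θ), and a single-atom correlation search over a candidate grid. The function names (`fibonacci_sphere`, `legendre_kernel`, `estimate_doa`), the grid, and the matching rule are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np
from numpy.polynomial import legendre


def fibonacci_sphere(count):
    """Return `count` roughly uniform unit vectors on the sphere, shape (count, 3)."""
    i = np.arange(count) + 0.5
    inclination = np.arccos(1.0 - 2.0 * i / count)
    azimuth = np.pi * (1.0 + np.sqrt(5.0)) * i
    return np.stack(
        [
            np.sin(inclination) * np.cos(azimuth),
            np.sin(inclination) * np.sin(azimuth),
            np.cos(inclination),
        ],
        axis=1,
    )


def legendre_kernel(cos_angle, order):
    """Evaluate sum_{n=0}^{order} (2n+1)/(4*pi) * P_n(cos_angle)."""
    coeffs = (2.0 * np.arange(order + 1) + 1.0) / (4.0 * np.pi)
    return legendre.legval(cos_angle, coeffs)


def estimate_doa(samples, sample_dirs, candidate_dirs, order):
    """Return the index of the candidate direction whose kernel best matches `samples`.

    samples:        (Q,) real directional response sampled at `sample_dirs`
    sample_dirs:    (Q, 3) unit vectors where the response was sampled
    candidate_dirs: (K, 3) unit vectors defining the dictionary atoms
    """
    # Cosine of the angle between every candidate and every sample direction.
    cosines = np.clip(candidate_dirs @ sample_dirs.T, -1.0, 1.0)
    # One dictionary atom (row) per candidate direction, normalised to unit length.
    dictionary = legendre_kernel(cosines, order)
    dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
    # Pick the atom with the largest correlation (a one-atom sparse fit).
    return int(np.argmax(dictionary @ samples))


# Hypothetical usage: a noiseless single-source response on a 200-point grid.
grid = fibonacci_sphere(200)
true_index = 57
observed = legendre_kernel(np.clip(grid @ grid[true_index], -1.0, 1.0), 4)
estimated_index = estimate_doa(observed, grid, grid, 4)
```

In this toy setting the observed response is itself one of the dictionary atoms, so the correlation search recovers the source direction exactly. Handling several sources, noise, and diffuse components would need a proper sparse solver and the bin selection step that the abstract mentions.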

URI

https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85111137405&origin=inward
https://hdl.handle.net/11511/91547

Journal

IEEE/ACM Transactions on Audio Speech and Language Processing

DOI

https://doi.org/10.1109/taslp.2021.3091845

Collections

Graduate School of Informatics, Article

Suggestions

Spherical harmonics based acoustic scene analysis for object-based audio Çöteli, Mert Burkay; Hacıhabiboğlu, Hüseyin; Department of Information Systems (2021-2-19) Object-based audio relies on elemental audio signals from individual sound sources and their associated metadata to be reconstructed at the listener side. While defining audio objects in a production setting is straightforward, it is not trivial to extract audio objects from more realistic recording scenarios such as concerts. Thus, existing object-based audio standards also define scene-based formats alongside object-based representations that provide immersive audio, but without the flexibility provided by...
3D perceptual soundfield reconstruction via sound field extrapolation Erdem, Eg; Hacıhabiboğlu, Hüseyin; Department of Multimedia Informatics (2020) Perceptual sound field reconstruction (PSR) is a spatial audio recording and reproduction method based on the application of stereophonic panning laws in microphone array design. PSR allows rendering a perceptually veridical and stable auditory perspective in the horizontal plane of the listener, and involves recording using near-coincident microphone arrays. This thesis extends the two-dimensional PSR concept to three dimensions and allows reconstructing an arbitrary sound field based on measurements with a...
Multiple Sound Source Localization With Steered Response Power Density and Hierarchical Grid Refinement COTELI, Mert Burkay; OLGUN, Orhun; Hacıhabiboğlu, Hüseyin (2018-11-01) Estimation of the direction-of-arrival (DOA) of sound sources is an important step in sound field analysis. Rigid spherical microphone arrays allow the calculation of a compact spherical harmonic representation of the sound field. The standard method for analyzing sound fields recorded using such arrays is steered response power (SRP) maps wherein the source DOA can be estimated as the steering direction that maximizes the output power of a maximally directive beam. This approach is computationally costly s...
Data Imputation Through the Identification of Local Anomalies Ozkan, Huseyin; Pelvan, Ozgun Soner; Kozat, Suleyman S. (Institute of Electrical and Electronics Engineers (IEEE), 2015-10) We introduce a comprehensive and statistical framework in a model free setting for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose: 1) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance and 2) a maximum a posteriori estimator to impute the corrupted data. As a generalization to Euclidean distance, we also propose ...
NUMERICAL INVESTIGATION OF SPINNING MODE TRANSMISSION THROUGH VARIABLE AREA ANNULAR DUCTS WITH FLOW Özyörük, Yusuf; Tester, Brian J. (2015-07-16) Understanding sound propagation through variable area ducts continues to be important for controlling turbomachinery noise. In analytical treatment of the problem it is usually assumed that duct cross sections vary slowly and no mode scattering takes place. It is the purpose of this work to investigate numerically the effects of cross-sectional changes of varying degree on transmission of acoustic modes typical to turbomachinery. First, sound fields of interest through such ducts are obtained numerically by...

Citation Formats

M. B. Coteli and H. Hacıhabiboğlu, “Sparse Representations with Legendre Kernels for DOA Estimation and Acoustic Source Separation,” IEEE/ACM Transactions on Audio Speech and Language Processing, pp. 2296–2309, 2021. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85111137405&origin=inward.