Localization Uncertainty in Time-Amplitude Stereophonic Reproduction

Date

2020-01-01

Author

De Sena, Enzo
Cvetkovic, Zoran
Hacıhabiboğlu, Hüseyin
Moonen, Marc
van Waterschoot, Toon

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This article studies how inter-channel time and level differences in stereophonic reproduction affect perceived localization uncertainty, defined as how difficult it is for a listener to tell where a sound source is located. To this end, a computational model of localization uncertainty is first proposed. The model calculates inter-aural time and level difference cues and compares them to those associated with free-field point-like sources. The comparison uses a distance functional that replicates the increased uncertainty observed experimentally when inter-aural time and level difference cues are inconsistent. The model is validated by formal listening tests, achieving a Pearson correlation of 0.99. It is then used to predict localization uncertainty for stereophonic setups with the listener in central and off-central positions. Results show that amplitude methods achieve slightly lower localization uncertainty for a listener positioned exactly in the center of the sweet spot. As soon as the listener moves away from that position, the situation reverses, and time-amplitude methods achieve lower localization uncertainty.
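The cue-comparison idea in the abstract can be illustrated with a toy sketch. This is not the distance functional proposed in the article, which the abstract does not specify. It assumes a Woodworth spherical-head formula for the inter-aural time difference, a made-up sine law for the inter-aural level difference, and a plain normalized Euclidean distance. All constants and function names (`free_field_itd`, `free_field_ild`, `cue_mismatch`) are hypothetical and chosen for illustration only.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s, assumed
ILD_MAX_DB = 20.0        # assumed peak ILD for the toy sine law
ITD_SCALE_S = 1e-4       # assumed normalization for ITD errors
ILD_SCALE_DB = 1.0       # assumed normalization for ILD errors


def free_field_itd(theta):
    """Woodworth spherical-head ITD (seconds) for azimuth theta in radians."""
    return HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))


def free_field_ild(theta):
    """Toy ILD (dB): a sine law used only for illustration."""
    return ILD_MAX_DB * math.sin(theta)


def cue_mismatch(itd, ild, steps=1801):
    """Smallest normalized distance between the observed (ITD, ILD) pair and
    the cue pairs of free-field sources at azimuths from -90 to +90 degrees."""
    best = math.inf
    for i in range(steps):
        theta = -math.pi / 2 + math.pi * i / (steps - 1)
        d_itd = (itd - free_field_itd(theta)) / ITD_SCALE_S
        d_ild = (ild - free_field_ild(theta)) / ILD_SCALE_DB
        best = min(best, math.hypot(d_itd, d_ild))
    return best


if __name__ == "__main__":
    t = math.radians(30)
    # Both cues point to the same direction: the mismatch is close to zero.
    print(cue_mismatch(free_field_itd(t), free_field_ild(t)))
    # ITD points right while ILD points left: the mismatch is clearly larger.
    print(cue_mismatch(free_field_itd(t), free_field_ild(-t)))
```

In this sketch, a cue pair that matches some free-field direction yields a mismatch near zero, while a pair whose time and level cues point to different directions yields a larger value, which is the qualitative behavior the abstract attributes to inconsistent cues.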

Subject Keywords

Speech and Hearing, Media Technology, Linguistics and Language, Signal Processing, Acoustics and Ultrasonics, Instrumentation, Electrical and Electronic Engineering

URI

https://hdl.handle.net/11511/57922

Journal

IEEE/ACM Transactions on Audio, Speech, and Language Processing

DOI

https://doi.org/10.1109/taslp.2020.2975419

Collections

Graduate School of Informatics, Article

Suggestions

Vibrations of open-section channels: A coupled flexural and torsional wave analysis Yaman, Yavuz (Elsevier BV, 1997-07-03) An exact analytical method is presented for the analysis of forced vibrations of uniform, open-section channels. The centroid and the shear center of the channel cross-sections considered do not coincide; hence the flexural and the torsional vibrations are coupled. In the context of this study, the type of any existing coupling is defined in terms of the independent motions which are coupled through mass and/or stiffness terms. Hence, if the flexural vibrations in one direction are coupled with the torsiona...
Dynamic Speech Spectrum Representation and Tracking Variable Number of Vocal Tract Resonance Frequencies With Time-Varying Dirichlet Process Mixture Models Özkan, Emre; Demirekler, Muebeccel (Institute of Electrical and Electronics Engineers (IEEE), 2009-11-01) In this paper, we propose a new approach for dynamic speech spectrum representation and tracking vocal tract resonance (VTR) frequencies. The method involves representing the spectral density of the speech signals as a mixture of Gaussians with unknown number of components for which time-varying Dirichlet process mixture model (DPM) is utilized. In the resulting representation, the number of formants is allowed to vary in time. The paper first presents an analysis on the continuity of the formants in the sp...
Joint spatial and temporal channel-shortening techniques for frequency selective fading MIMO channels Toker, Canan; Chambers, JA; Baykal, Buyurman (2005-02-01) It is well understood that the maximum likelihood estimator is a powerful equalisation technique for frequency selective fading channels, and in particular for MIMO systems. The complexity of this estimator, however, grows exponentially with the number of users and multipath taps, hence limiting the use of this algorithm in MIMO systems. In the paper, the authors propose a joint spatial and temporal channel-shortening filter as a pre-processor to reduce significantly the complexity of a maximum likelihood e...
Language learning from the perspective of nonlinear dynamic systems Hohenberger, Annette Edeltraud; Peltzer-Karpf, Annemarie (Walter de Gruyter GmbH, 2009-01-01) This article outlines a nonlinear dynamic systems approach to language learning on the basis of developmental cognitive neuroscience. Language learning, on this view, is a process of experience-dependent shaping and selection of broadly defined domain-general and domain-specific genetic predispositions. The central concept of development is (neuro)cognitive growth in terms of self-organization. Linguistic structure-building is synergetic and emergent insofar as the acquisition of a critical mass of eleme...
Nonlinear interactive source-filter models for speech KOÇ, Turgay; Çiloğlu, Tolga (2016-03-01) The linear source-filter model of speech production assumes that the source of the speech sounds is independent of the filter. However, acoustic simulations based on the physical speech production models show that when the fundamental frequency of the source harmonics approaches the first formant of the vocal tract filter, the filter has significant effects on the source due to the nonlinear coupling between them. In this study, two interactive system models are proposed under the quasi steady Bernoulli flo...

Citation Formats

E. De Sena, Z. Cvetkovic, H. Hacıhabiboğlu, M. Moonen, and T. van Waterschoot, “Localization Uncertainty in Time-Amplitude Stereophonic Reproduction,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, pp. 1000–1015, 2020. [Online]. Available: https://hdl.handle.net/11511/57922.