Multichannel Audio Coding Based on Analysis by Synthesis

2011-04-01
Elfitri, Ikhwana
Günel Kılıç, Banu
Kondoz, Ahmet
Spatial hearing enables translation of an auditory scene into a perceived 3-D image by interpreting the acoustic cues related to the sounding objects, their locations, and the physical characteristics of the space. Spatial audio production requires multichannel audio signals in order to convey this information and increase the realism of a real or virtual environment for applications such as home entertainment, virtual reality, and remote collaboration. As demand for spatial audio continues to expand, efficient coding of multichannel audio content becomes increasingly important. This paper provides an overview of some well-known multichannel audio coding techniques and presents a new coding framework for improving the objective fidelity of the decoded signals. A closed-loop encoding system based on the analysis-by-synthesis (AbS) principle, applied to the MPEG Surround (MPS) architecture, is described. Comparison results are presented, which show that significant improvements can be achieved with a closed-loop system instead of the conventional open-loop system.
PROCEEDINGS OF THE IEEE
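
The difference between open-loop and closed-loop (analysis-by-synthesis) encoding mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's MPS-based system: the two-channel downmix, the single gain parameter, the `CODEBOOK` values, and all function names below are assumptions made only for illustration. The open-loop encoder estimates the parameter from the input and quantizes it directly; the closed-loop encoder runs the decoder model for every candidate and keeps the one with the smallest reconstruction error.

```python
import math

# Hypothetical quantizer levels for a single spatial gain parameter.
CODEBOOK = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]

def downmix(left, right):
    """Mono downmix of a two-channel signal."""
    return [(l + r) / 2 for l, r in zip(left, right)]

def synthesize(mono, gain):
    """Toy decoder model: rebuild both channels from the downmix and one gain."""
    return [gain * m for m in mono], [(2 - gain) * m for m in mono]

def reconstruction_error(left, right, gain):
    """Squared error between the input channels and the decoded channels."""
    left_hat, right_hat = synthesize(downmix(left, right), gain)
    return (sum((a - b) ** 2 for a, b in zip(left, left_hat))
            + sum((a - b) ** 2 for a, b in zip(right, right_hat)))

def open_loop_gain(left, right):
    """Open loop: estimate the gain from channel energies, then quantize it."""
    amp_l = math.sqrt(sum(x * x for x in left))
    amp_r = math.sqrt(sum(x * x for x in right))
    total = amp_l + amp_r
    estimate = 2 * amp_l / total if total > 0 else 1.0
    return min(CODEBOOK, key=lambda g: abs(g - estimate))

def closed_loop_gain(left, right):
    """Closed loop (AbS): decode with every candidate, keep the lowest error."""
    return min(CODEBOOK, key=lambda g: reconstruction_error(left, right, g))

# Phase-opposed channels: an energy-based estimate ignores the sign relation.
left = [1.0, -1.0, 1.0, -1.0]
right = [-0.5, 0.5, -0.5, 0.5]
g_open = open_loop_gain(left, right)
g_closed = closed_loop_gain(left, right)
err_open = reconstruction_error(left, right, g_open)
err_closed = reconstruction_error(left, right, g_closed)
```

Because the closed-loop search minimizes the reconstruction error over the same codebook the open-loop encoder quantizes to, its error can never exceed the open-loop error in this toy model. The cost is one decoder run per candidate.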

Citation Formats
I. Elfitri, B. Günel Kılıç, and A. Kondoz, “Multichannel Audio Coding Based on Analysis by Synthesis,” PROCEEDINGS OF THE IEEE, pp. 657–670, 2011. [Online]. Available: https://hdl.handle.net/11511/32565.