Architectures for multi-threaded MVC-compliant multi-view video decoding and benchmark tests

2010-06-01
Akar, Gözde
Tekalp, A. Murat
3D video based on stereo/multi-view representations is becoming widely popular. Real-time encoding/decoding of such video is an important concern as the number and spatial/temporal resolution of views increase. We present a systematic method for the design and optimization of multi-threaded multi-view video encoding/decoding algorithms on multi-core processors and provide benchmark results for real-time decoding. The proposed multi-core decoding architectures are compliant with the current MVC extension of the H.264/AVC international standard, and enable multi-threaded processing with negligible loss of encoding efficiency and minimal processing overhead. Benchmark results show that multi-core processors and multi-threaded decoding are necessary for real-time high-definition multi-view video decoding and display.
SIGNAL PROCESSING-IMAGE COMMUNICATION

Suggestions

MULTI-THREADED ARCHITECTURES AND BENCHMARK TESTS FOR REAL-TIME MULTI-VIEW VIDEO DECODING
Akar, Gözde; TEKALP, AHMET MURAT (2009-07-03)
3D video based on multi-view representations is becoming widely popular. Real-time encoding/decoding of such video is an important concern as the number and resolution of views increase. We present systematic methods for design and optimization of real-time multi-view video encoding/decoding algorithms using multi-core processors and provide benchmark results. The proposed multi-core decoding architectures are fully compliant with the current JVT-MVC international standard, and enable multi-threaded process...
New method for the fusion of complementary information from infrared and visual images for object detection
Ulusoy, İlkay (Institution of Engineering and Technology (IET), 2011-02-01)
Visual and infrared cameras have complementary properties and using them together may increase the performance of object detection applications. Although the fusion of visual and infrared information results in a better recall rate than using only one of those domains, there is always a decrease in the precision rate whereas the infrared domain on its own always has higher precision. Thus, the fusion of these domains is meaningful only for a better recall rate, which means that more foreground pixels are de...
Towards 3-D scene reconstruction from broadcast video
Imre, Evren; KNORR, Sebastian; ÖZKALAYCI, Burak; TOPAY, Ugur; Alatan, Abdullah Aydın; SİKORA, Thomas (Elsevier BV, 2007-02-01)
Three-dimensional (3-D) scene reconstruction from broadcast video is a challenging problem with many potential applications, such as 3-D TV, free-view TV, augmented reality or three-dimensionalization of two-dimensional (2-D) media archives. In this paper, a flexible and effective system capable of efficiently reconstructing 3-D scenes from broadcast video is proposed, with the assumption that there is relative motion between camera and scene/objects. The system requires no a priori information and input, o...
End-to-end stereoscopic video streaming with content-adaptive rate and format control
Aksay, Anil; Pehlivan, Selen; Akar, Gözde; Bilen, Cagdas; OZCELEBİ, Tanir; Civanlar, M. Reha; Tekalp, A. Murat (Elsevier BV, 2007-02-01)
We address efficient compression and real-time streaming of stereoscopic video over the current Internet. We first propose content-adaptive stereo video coding (CA-SC), where additional coding gain, beyond what can be achieved by exploiting only inter-view correlations, is targeted by down-sampling one of the views spatially or temporally depending on the content, based on the well-known theory that the human visual system can perceive high frequencies in three-dimensional (3D) from the higher quality view. ...
Using multi-modal 3D contours and their relations for vision and robotics
BAŞESKİ, Emre; Pugeault, Nicolas; Kalkan, Sinan; BODENHAGEN, Leon; Piater, Justus H.; KRÜGER, Norbert (Elsevier BV, 2010-11-01)
In this work, we make use of 3D contours and relations between them (namely, coplanarity, cocolority, distance and angle) for four different applications in the area of computer vision and vision-based robotics. Our multi-modal contour representation covers both geometric and appearance information. We show the potential of reasoning with global entities in the context of visual scene analysis for driver assistance, depth prediction, robotic grasping and grasp learning. We argue that, such 3D global reasoni...
Citation Formats
G. Akar and A. M. Tekalp, “Architectures for multi-threaded MVC-compliant multi-view video decoding and benchmark tests,” SIGNAL PROCESSING-IMAGE COMMUNICATION, pp. 325–334, 2010, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/38638.