Towards 3-D scene reconstruction from broadcast video

Date

2007-02-01

Author

Imre, Evren
KNORR, Sebastian
ÖZKALAYCI, Burak
TOPAY, Ugur
Alatan, Abdullah Aydın
SİKORA, Thomas

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Three-dimensional (3-D) scene reconstruction from broadcast video is a challenging problem with many potential applications, such as 3-D TV, free-view TV, augmented reality, or the three-dimensionalization of two-dimensional (2-D) media archives. In this paper, a flexible and effective system capable of efficiently reconstructing 3-D scenes from broadcast video is proposed, under the assumption that there is relative motion between the camera and the scene or objects. The system requires no a priori information or input other than the video sequence itself, and is capable of estimating the internal and external camera parameters and performing 3-D motion-based segmentation, as well as computing a dense depth field. The system also serves as a showcase for some novel approaches to the moving object segmentation, sparse reconstruction, and dense reconstruction problems. According to simulations on both synthetic and real data, the system achieves promising performance for typical TV content, indicating that it is a significant step towards the 3-D reconstruction of scenes from broadcast video.

Subject Keywords

Signal Processing, Electrical and Electronic Engineering, Software, Computer Vision and Pattern Recognition

URI

https://hdl.handle.net/11511/48002

Journal

SIGNAL PROCESSING-IMAGE COMMUNICATION

DOI

https://doi.org/10.1016/j.image.2006.11.011

Collections

Department of Electrical and Electronics Engineering, Article

Citation Formats

E. Imre, S. KNORR, B. ÖZKALAYCI, U. TOPAY, A. A. Alatan, and T. SİKORA, “Towards 3-D scene reconstruction from broadcast video,” SIGNAL PROCESSING-IMAGE COMMUNICATION, pp. 108–126, 2007. [Online]. Available: https://hdl.handle.net/11511/48002.