Deep 3D semantic scene extrapolation
Date: 2018
Author: Abbasi, Ali
In this thesis, we study the problem of 3D scene extrapolation with deep models. Scene extrapolation is a challenging variant of the scene completion problem, which concerns predicting the missing part(s) of a scene. While 3D scene completion algorithms in the literature fill the occluded parts of a scene, such as a chair behind a table, we focus on extrapolating one available half of a scene to a full one, a problem that, to our knowledge, has not been studied before. Our approaches are based on deep generative adversarial networks (GANs) and convolutional neural networks (CNNs). As input, our models take one half of a 3D voxelized scene and complete the other half as output. Our baseline CNN model consists of convolutional and ReLU layers with multiple residual connections, followed by a softmax classifier with a voxel-wise cross-entropy loss. We use this baseline CNN as the generator network in the proposed GAN model. To obtain sharper and more realistic results, we regularize the GAN with a discriminator consisting of two internal networks, one local and one global: the local discriminator takes only the generated part of a scene, while the global discriminator takes both the generated part and the real first part, in order to distinguish real scenes from fake ones. Building on the CNN model, we also propose a hybrid model that takes the top-view projection of the input scene in parallel with the 3D input. We train and evaluate our models on the synthetic 3D SUNCG dataset. We show that our trained networks can predict the other half of a scene and complete objects correctly with suitable extents. With a discussion of the challenges, we propose scene extrapolation as a challenging testbed for future research in deep learning.
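The baseline generator described above ends in a softmax classifier trained with a voxel-wise cross-entropy loss. As a minimal illustrative sketch (not the thesis code; the grid shape, class count, and function name are assumptions), this loss can be written over a voxel grid of per-class logits as follows:

```python
import numpy as np

def voxelwise_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over every voxel of a 3D grid.

    logits: (D, H, W, C) raw class scores per voxel
    labels: (D, H, W) integer class index per voxel
    """
    # Numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Gather the log-probability of the true class at each voxel
    d, h, w = np.indices(labels.shape)
    return -log_probs[d, h, w, labels].mean()

# Toy example: a 2x2x2 voxel grid with 3 semantic classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 2, 2, 3))
labels = rng.integers(0, 3, size=(2, 2, 2))
print(voxelwise_cross_entropy(logits, labels))
```

Averaging over all voxels treats every position in the predicted half-scene equally; in a full model this scalar would be minimized for the generator, alongside the adversarial terms from the local and global discriminators.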
Subject Keywords
Image processing, Three-dimensional imaging, Convolutions (Mathematics), Neural networks (Computer science)
URI
http://etd.lib.metu.edu.tr/upload/12622458/index.pdf
https://hdl.handle.net/11511/27568
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
2D/3D human pose estimation using deep convolutional neural nets
Kocabaş, Muhammed; Akbaş, Emre; Department of Computer Engineering (2019)
In this thesis, we propose algorithms to estimate 2D/3D human pose from single view images. In the first part of the thesis, we present MultiPoseNet, a novel bottom-up multiperson pose estimation architecture that combines a multi-task model with a novel assignment method. MultiPoseNet can jointly handle person detection, keypoint detection, person segmentation and pose estimation problems. The novel assignment method is implemented by the Pose Residual Network (PRN) which receives keypoint and person detec...
3D indirect shape retrieval based on hand interaction
Irmak, Erdem Can; Sahillioğlu, Yusuf; Department of Game Technologies (2017)
In this thesis, a novel 3D indirect shape analysis method is presented which successfully retrieves 3D shapes based on the hand-object interaction. In the first part of the study, the human hand information is processed and transferred to the virtual environment by Leap Motion Controller. Position and rotation of the hand, the angle of the finger joints are used for this part in our method. Also, in this approach, a new type of feature, which we call interaction point, is introduced. These interaction p...
3D object recognition using scale space of curvatures
Akagündüz, Erdem; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2011)
In this thesis, a generic, scale and resolution invariant method to extract 3D features from 3D surfaces, is proposed. Features are extracted with their scale (metric size and resolution) from range images using scale-space of 3D surface curvatures. Different from previous scale-space approaches; connected components within the classified curvature scale-space are extracted as features. Furthermore, scales of features are extracted invariant of the metric size or the sampling of the range images. Geometric ...
Visual-inertial sensor fusion for 3D urban modeling
Sırtkaya, Salim; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2013)
In this dissertation, a real-time, autonomous and geo-registered approach is presented to tackle the large scale 3D urban modeling problem using a camera and inertial sensors. The proposed approach exploits the special structures of urban areas and visual-inertial sensor fusion. The buildings in urban areas are assumed to have planar facades that are perpendicular to the local level. A sparse 3D point cloud of the imaged scene is obtained from visual feature matches using camera poses estimates, and planar ...
3D face recognition with local shape descriptors
İnan, Tolga; Halıcı, Uğur; Department of Electrical and Electronics Engineering (2011)
This thesis represents two approaches for three dimensional face recognition. In the first approach, a generic face model is fitted to human face. Local shape descriptors are located on the nodes of generic model mesh. Discriminative local shape descriptors on the nodes are selected and fed as input into the face recognition system. In the second approach, local shape descriptors which are uniformly distributed across the face are calculated. Among the calculated shape descriptors that are discriminative fo...
Citation Formats
IEEE:
A. Abbasi, “Deep 3D semantic scene extrapolation,” M.S. - Master of Science, Middle East Technical University, 2018.