Spatial 3D local descriptors for object recognition in RGB-D images

2016
Loğoğlu, K. Berker
The introduction of affordable yet relatively high-resolution RGB-D sensors, which capture synchronized color and depth, together with efforts on open-source point-cloud processing tools, has boosted research in both computer vision and robotics. One of the key areas that has drawn particular attention is object recognition, since it is a crucial step in various applications. In this thesis, two spatially enhanced local 3D descriptors are proposed for object recognition tasks: Histograms of Spatial Concentric Surflet-Pairs (SPAIR) and Colored SPAIR (CoSPAIR). The proposed descriptors are compared against the state-of-the-art local 3D descriptors available in the Point Cloud Library (PCL), and their object recognition performance is evaluated on several publicly available datasets. The experiments demonstrate that the proposed CoSPAIR descriptor outperforms the state-of-the-art descriptors in both category-level and instance-level recognition tasks. The observed performance gains over the second-best performing descriptor are up to 9.9 percentage points for category-level recognition and 16.49 percentage points for instance-level recognition.

Suggestions

CoSPAIR: Colored Histograms of Spatial Concentric Surflet-Pairs for 3D object recognition
Logoglu, K. Berker; Kalkan, Sinan; Temizel, Alptekin (2016-01-01)
The introduction of RGB-D sensors, together with efforts on open-source point-cloud processing tools, has boosted research in both computer vision and robotics. One of the key areas that has drawn particular attention is object recognition, since it is a crucial step in various applications. In this paper, two spatially enhanced local 3D descriptors are proposed for object recognition tasks: Histograms of Spatial Concentric Surflet-Pairs (SPAIR) and Colored SPAIR (CoSPAIR). The proposed descriptors ar...
Stabilization of an image based tracking system
Şener, Irmak Ece; Leblebicioğlu, Mehmet Kemal; Department of Electrical and Electronics Engineering (2015)
Vision-based tracking systems require high-resolution images of the targets. In addition, the tracking system tries to hold the tracked objects at the center of the camera's field of view to achieve robust and successful tracking. Such systems are usually placed on a platform that is controlled by a gimbal. The main job of the gimbal is to remove jitter and undesirable vibrations of the image platform. In this thesis, such an image platform together with its gimbal, and its controller will be...
Data-driven image captioning via salient region discovery
Kilickaya, Mert; Akkuş, Burak Kerim; Çakıcı, Ruket; Erdem, Aykut; Erdem, Erkut; İkizler Cinbiş, Nazlı (Institution of Engineering and Technology (IET), 2017-09-01)
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have proven to be highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, the authors propose to integrate an object-based semantic image r...
Visual object detection and tracking using local convolutional context features and recurrent neural networks
Kaya, Emre Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
Visual object detection and tracking are two major problems in computer vision which have important real-life application areas. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and outperformed methods that rely on handcrafted representations in both detection and tracking. On the other hand, Recurrent Neural Networks (RNNs) are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduc...
Range data recognition: segmentation, matching, and similarity retrieval
Yalçın Bayramoğlu, Neslihan; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2011)
Improvements in 3D scanning technologies have led to the necessity of managing range image databases. Hence, the requirement to describe and index this type of data arises. Up to now, considerable work has been done on capture, transmission, and visualization; however, there is still a gap in 3D semantic analysis between the requirements of the applications and the obtained results. In this thesis we studied 3D semantic analysis of range data. Under this broad title we address segmentation of range...
Citation Formats
K. B. Loğoğlu, “Spatial 3D local descriptors for object recognition in RGB-D images,” Ph.D. - Doctoral Program, Middle East Technical University, 2016.