Contextualized scene modeling using Boltzmann machines

Date

2018

Author

Bozcan, İlker

Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments. In this thesis, we propose a variant of Boltzmann Machines (BMs) for contextualized scene modeling. Although many computational models have been proposed for the problem, ours is the first to bring together objects, relations, and affordances in a highly capable generative model. To this end, we introduce a hybrid version of BMs in which relations and affordances are incorporated through shared, tri-way connections. We evaluate our method against several baselines on missing or out-of-context object detection, relation estimation, and affordance estimation tasks. We also illustrate the scene generation capabilities of the model.

Subject Keywords

Robot vision, Image processing, Artificial intelligence

URI

http://etd.lib.metu.edu.tr/upload/12622335/index.pdf
https://hdl.handle.net/11511/27570

Collections

Graduate School of Natural and Applied Sciences, Thesis

Citation Formats

İ. Bozcan, “Contextualized scene modeling using Boltzmann machines,” M.S. thesis, Middle East Technical University, 2018.