Visual object detection and tracking using local convolutional context features and recurrent neural networks

Kaya, Emre Can
Visual object detection and tracking are two major problems in computer vision with important real-life applications. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and have outperformed methods that rely on handcrafted representations in both detection and tracking. Recurrent Neural Networks (RNNs), on the other hand, are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduced to a proposal-based detection scheme to improve object detection performance. A comprehensive experimental study is conducted to demonstrate the effectiveness of this newly proposed approach. On the tracking side, the effect of several design choices on an RNN-based tracking algorithm is investigated through comparative experiments. Finally, the proposed context-feature-based method is combined with the RNN-based tracking framework, yielding a joint detection-tracking framework that outperforms the baseline model.


Object recognition and segmentation via shape models
Altınoklu, Metin Burak; Ulusoy, İlkay; Tarı, Zehra Sibel; Department of Electrical and Electronics Engineering (2016)
In this thesis, the problem of object detection, recognition and segmentation in computer vision is addressed with shape-based methods. An efficient object detection method based on a sparse skeleton is proposed. The proposed method is an improved chamfer template matching method for recognition of articulated objects. Using a probabilistic graphical model structure, shape variation is represented in a skeletal shape model, where nodes correspond to parts consisting of lines and edges correspond to pa...
Topcu, Osman; Orguner, Umut; Alatan, Abdullah Aydın; ERCAN, ALİ ÖZER (2014-04-25)
Visual tracking has an important place among computer vision applications. Visual tracking with particle filters is a well-known methodology. The performance of particle filters depends on efficient sampling of the state space, which, in turn, depends on the number of particles. In this paper, the Rao-Blackwell technique is applied to particle filters to improve sampling efficiency. Both algorithms are applied to the people tracking problem. Under the same circumstances, the resulting algorithm is demonstrated...
Shape descriptors based on intersection consistency and global binary patterns
Sivri, Erdal; Kalkan, Sinan; Department of Computer Engineering (2012)
Shape description is an important problem in computer vision because most vision tasks that require comparing or matching visual entities rely on shape descriptors. In this thesis, two novel shape descriptors are proposed, namely Intersection Consistency Histogram (ICH) and Global Binary Patterns (GBP). The former is based on a local regularity measure called Intersection Consistency (IC), which determines whether edge pixels in an image patch point towards the center or not. The second method, called Globa...
Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision?
KRÜGER, Norbert; JANSSEN, Peter; Kalkan, Sinan; LAPPE, Markus; LEONARDİS, Ales; PİATER, Justus; Rodriguez-Sanchez, Antonio J.; WİSKOTT, Laurenz (Institute of Electrical and Electronics Engineers (IEEE), 2013-08-01)
Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Or...
Data-driven image captioning via salient region discovery
Kilickaya, Mert; Akkuş, Burak Kerim; Çakıcı, Ruket; Erdem, Aykut; Erdem, Erkut; İKİZLER CİNBİŞ, NAZLI (Institution of Engineering and Technology (IET), 2017-09-01)
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have been proven to be highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, the authors propose to integrate an object-based semantic image r...
Citation Formats
E. C. Kaya, “Visual object detection and tracking using local convolutional context features and recurrent neural networks,” M.S. - Master of Science, Middle East Technical University, 2018.