Metric learning using deep recurrent networks for visual clustering and retrieval

2018
Can, Oğul
Learning an image similarity metric plays a key role in visual analysis, especially when the training set contains a large number of hard negative samples, which are difficult to distinguish from samples of other classes. Given the outstanding results of deep metric learning on visual tasks such as image clustering and retrieval, selecting a proper loss function and sampling method has become central to improving performance. Existing metric learning approaches have two significant drawbacks: inadequate mini-batch sampling and disregard for higher-order relations between data samples. In this thesis, two novel methods are proposed to alleviate these deficiencies. First, a novel loss function is introduced to identify multiple similar examples in a local neighborhood, together with a novel batch construction method that selects representative hard negatives; a deep network is trained with this loss function using the proposed batch construction approach. Second, to account for higher-order relations between samples, a novel deep metric learning framework that incorporates a recurrent neural network architecture is proposed. Extensive experimental results on three publicly available datasets show that the proposed approaches yield competitive or better performance compared with state-of-the-art metric learning methods.
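For readers unfamiliar with hard negative sampling, the following is a minimal sketch of a generic batch-hard triplet loss, a common baseline in deep metric learning. It is not the loss function or the batch construction method proposed in the thesis, whose details are not given in this abstract. The function names, the margin value, and the toy embeddings are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Generic batch-hard triplet loss (illustrative, not the thesis method).

    For each anchor, take the farthest same-class sample (hardest positive)
    and the closest other-class sample (hardest negative), then apply the
    hinge max(0, d_pos - d_neg + margin). Anchors lacking a positive or a
    negative in the batch are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [
            euclidean(anchor, e)
            for j, (e, l) in enumerate(zip(embeddings, labels))
            if l == label and j != i
        ]
        neg = [
            euclidean(anchor, e)
            for e, l in zip(embeddings, labels)
            if l != label
        ]
        if not pos or not neg:
            continue
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batches: 2-D embeddings for two classes (hypothetical values).
separated = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
interleaved = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
labels = [0, 0, 1, 1]

loss_separated = batch_hard_triplet_loss(separated, labels, margin=0.5)
loss_interleaved = batch_hard_triplet_loss(interleaved, labels, margin=0.5)
```

In the well-separated batch every hardest negative lies farther from its anchor than the hardest positive by more than the margin, so the loss is zero. In the interleaved batch each anchor has a negative closer than its positive, and those hard negatives are what drive the loss.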

Suggestions

Hierarchical representations for visual object tracking by detection
Beşbınar, Beril; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2015)
Deep learning is the discipline of training computational models that are composed of multiple layers, and these methods have improved the state of the art in many areas such as visual object detection, scene understanding or speech recognition. The rebirth of these fairly old computational models is usually attributed to the availability of large datasets, the increase in the computational power of current hardware and more recently proposed unsupervised training methods that exploit the internal structure of very lar...
Machine learning methods for opponent modeling in games of imperfect information
Şirin, Volkan; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2012)
This thesis presents a machine learning approach to the problem of opponent modeling in games of imperfect information. The efficiency of various artificial intelligence techniques is investigated in this domain. A sequential game is called an imperfect information game if players do not have all the information about the current state of the game. A very popular example is Texas Hold'em Poker, which is used for realization of the suggested methods in this thesis. Opponent modeling is the system that enabl...
Visual object detection and tracking using local convolutional context features and recurrent neural networks
Kaya, Emre Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
Visual object detection and tracking are two major problems in computer vision which have important real-life application areas. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and outperformed methods that rely on handcrafted representations in both detection and tracking. On the other hand, Recurrent Neural Networks (RNNs) are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduc...
Learning semi-supervised nonlinear embeddings for domain-adaptive pattern recognition
Vural, Elif (2019-05-20)
We study the problem of learning nonlinear data embeddings in order to obtain representations for efficient and domain-invariant recognition of visual patterns. Given observations of a training set of patterns from different classes in two different domains, we propose a method to learn a nonlinear mapping of the data samples from different domains into a common domain. The nonlinear mapping is learnt such that the class means of different domains are mapped to nearby points in the common domain in order to...
Continuous dimensionality characterization of image structures
Felsberg, Michael; Kalkan, Sinan; Kruger, Norbert (Elsevier BV, 2009-05-04)
Intrinsic dimensionality is a concept introduced by statistics and later used in image processing to measure the dimensionality of a data set. In this paper, we introduce a continuous representation of the intrinsic dimension of an image patch in terms of its local spectrum or, equivalently, its gradient field. By making use of a cone structure and barycentric co-ordinates, we can associate three confidences to the three different ideal cases of intrinsic dimensions corresponding to homogeneous image patche...
Citation Formats
O. Can, “Metric learning using deep recurrent networks for visual clustering and retrieval,” M.S. - Master of Science, Middle East Technical University, 2018.