Metric learning using deep recurrent networks for visual clustering and retrieval

2018
Can, Oğul
Learning an image similarity metric plays a key role in visual analysis, especially when a training set contains a large number of hard negative samples, that is, samples that are difficult to distinguish from those of other classes. Given the outstanding results of deep metric learning on visual tasks such as image clustering and retrieval, selecting a proper loss function and sampling method has become central to improving performance. Existing metric learning approaches have two significant drawbacks: inadequate mini-batch sampling and disregard for higher-order relations between data samples. In this thesis, two novel methods are proposed to alleviate these deficiencies. First, a novel loss function is introduced to identify multiple similar examples in a local neighborhood. Moreover, a novel batch construction method is presented to select representative hard negatives. The deep network is trained with this loss function on batches built by the proposed construction approach. To account for higher-order relations between samples, a novel deep metric learning framework that incorporates a recurrent neural network architecture is also proposed. Extensive experimental results on three publicly available datasets show that the proposed approaches yield competitive or better performance compared with state-of-the-art metric learning methods.
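
The abstract does not spell out the thesis's loss function or batch construction method, so neither is reproduced here. As a rough illustration of what "hard negatives" mean inside a mini-batch, the sketch below uses a generic batch-hard triplet loss, a different and widely used technique. The function names, the margin value, and the toy embeddings are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Generic batch-hard triplet loss (NOT the loss proposed in the thesis).

    For each anchor, take the farthest same-class sample (hardest positive)
    and the closest other-class sample (hardest negative), then apply a
    hinge with the given margin. Anchors lacking a positive or a negative
    in the batch are skipped. Returns the mean over the remaining anchors.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [
            euclidean(anchor, e)
            for j, (e, l) in enumerate(zip(embeddings, labels))
            if l == label and j != i
        ]
        neg = [
            euclidean(anchor, e)
            for e, l in zip(embeddings, labels)
            if l != label
        ]
        if not pos or not neg:
            continue
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: the class-1 sample lies between the two class-0 samples,
# so it is a hard negative for both of them.
hard_batch = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]
hard_labels = [0, 0, 1]
hard_loss = batch_hard_triplet_loss(hard_batch, hard_labels, margin=0.5)

# Toy batch with well-separated classes: every hinge term is zero.
easy_batch = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [0.0, 4.0]]
easy_labels = [0, 0, 1, 1]
easy_loss = batch_hard_triplet_loss(easy_batch, easy_labels, margin=0.2)
```

In the first batch the negative is closer to each anchor than the anchor's own positive, so the loss is large. In the second batch the classes are already separated by more than the margin and contribute nothing. This shows why the choice of samples in a mini-batch affects how much training signal the loss provides.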

Suggestions

Hierarchical representations for visual object tracking by detection
Beşbınar, Beril; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2015)
Deep learning is the discipline of training computational models that are composed of multiple layers, and these methods have improved the state of the art in many areas such as visual object detection, scene understanding and speech recognition. The rebirth of these fairly old computational models is usually attributed to the availability of large datasets, the increase in the computational power of current hardware and more recently proposed unsupervised training methods that exploit the internal structure of very lar...
Evaluation of visual quality metrics
Ölgün, Ramazan Ferhat; Akar, Gözde; Department of Electrical and Electronics Engineering (2011)
The aim of this study is to examine the visual quality metrics that are widely accepted in the literature, to evaluate them on different distortion types and to compare their overall performance in terms of prediction accuracy, monotonicity, consistency and complexity. The algorithms behind the quality metrics in the literature and the parameters used for evaluating quality metric performance are studied. This thesis also includes an explanation of the Human Visual System, classification of visual quality metrics...
Visual object detection and tracking using local convolutional context features and recurrent neural networks
Kaya, Emre Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
Visual object detection and tracking are two major problems in computer vision which have important real-life application areas. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and outperformed methods that rely on handcrafted representations in both detection and tracking. On the other hand, Recurrent Neural Networks (RNNs) are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduc...
Machine learning methods for opponent modeling in games of imperfect information
Şirin, Volkan; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2012)
This thesis presents a machine learning approach to the problem of opponent modeling in games of imperfect information. The efficiency of various artificial intelligence techniques is investigated in this domain. A sequential game is called an imperfect-information game if players do not have all the information about the current state of the game. A very popular example is Texas Hold'em poker, which is used to implement the methods suggested in this thesis. Opponent modeling is the system that enabl...
Object recognition and segmentation via shape models
Altınoklu, Metin Burak; Ulusoy, İlkay; Tarı, Zehra Sibel; Department of Electrical and Electronics Engineering (2016)
In this thesis, the problem of object detection, recognition and segmentation in computer vision is addressed with shape-based methods. An efficient object detection method based on a sparse skeleton is proposed. The proposed method is an improved chamfer template matching method for the recognition of articulated objects. Using a probabilistic graphical model structure, shape variation is represented in a skeletal shape model, where nodes correspond to parts consisting of lines and edges correspond to pa...
Citation Formats
O. Can, “Metric learning using deep recurrent networks for visual clustering and retrieval,” M.S. - Master of Science, Middle East Technical University, 2018.