Multi-fold MIL Training for Weakly Supervised Object Localization

2014-01-01
Cinbiş, Ramazan Gökberk
Schmid, Cordelia
Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence or presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when high-dimensional representations, such as Fisher vectors, are used. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset. Compared to state-of-the-art weakly supervised detectors, our approach better localizes objects in the training images, which translates into improved detection performance.
27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
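
The alternation described in the abstract (train a detector, then re-infer object locations in the positive images) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the detector, the number of folds, the initialization, or the number of iterations, so everything below is a hypothetical stand-in: the function names `train_detector`, `score`, and `multifold_mil`, the mean-difference linear scorer, three folds, five iterations, and starting from the first candidate window of each image. The one idea taken from the abstract is that windows in a positive image are re-selected by a detector that was not trained on that image, which is one way to keep training from locking onto its own earlier mistakes.

```python
import random


def train_detector(pos, neg):
    """Stand-in detector (assumption): mean(positives) - mean(negatives)."""
    dim = len(pos[0])
    mean_pos = [sum(x[j] for x in pos) / len(pos) for j in range(dim)]
    mean_neg = [sum(x[j] for x in neg) / len(neg) for j in range(dim)]
    return [p - n for p, n in zip(mean_pos, mean_neg)]


def score(w, x):
    """Linear detection score of one window feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x))


def multifold_mil(pos_bags, neg_windows, n_folds=3, n_iters=5, seed=0):
    """Return (selected window index per positive image, final detector).

    pos_bags: one list of candidate-window feature vectors per positive image.
    neg_windows: feature vectors of windows taken from negative images.
    """
    n_pos = len(pos_bags)
    if n_folds < 2 or n_pos < 2:
        raise ValueError("need at least two folds and two positive images")
    rng = random.Random(seed)
    selected = [0] * n_pos  # assumed initialization: first candidate window
    for _ in range(n_iters):
        order = list(range(n_pos))
        rng.shuffle(order)
        folds = [order[k::n_folds] for k in range(n_folds)]
        updated = list(selected)
        for held_out in folds:
            held = set(held_out)
            # Train only on the current selections of the other folds.
            train_pos = [pos_bags[i][selected[i]]
                         for i in range(n_pos) if i not in held]
            w = train_detector(train_pos, neg_windows)
            # Re-localize in images this detector was not trained on.
            for i in held_out:
                scores = [score(w, x) for x in pos_bags[i]]
                updated[i] = scores.index(max(scores))
        selected = updated
    # Final detector trained on the selected window of every positive image.
    final_pos = [bag[i] for bag, i in zip(pos_bags, selected)]
    return selected, train_detector(final_pos, neg_windows)
```

Calling `multifold_mil(pos_bags, neg_windows)` returns one window index per positive image together with the weight vector of a detector trained on those windows. Setting `n_folds` to the number of positive images gives a leave-one-out variant of the same loop.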

Citation Formats
R. G. Cinbiş and C. Schmid, “Multi-fold MIL Training for Weakly Supervised Object Localization,” presented at the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, 2014, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/57653.