Multi-fold MIL Training for Weakly Supervised Object Localization

2014-01-01
Cinbiş, Ramazan Gökberk
Schmid, Cordelia
Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when high-dimensional representations, such as the Fisher vectors, are used. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset. Compared to state-of-the-art weakly supervised detectors, our approach better localizes objects in the training images, which translates into improved detection performance.
27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
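
The training loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the multi-fold step works like cross-validation, so that the object windows in a held-out fold of positive images are re-localized by a detector trained only on the other folds. A ridge-regression scorer stands in for the detector, low-dimensional random vectors stand in for Fisher vectors, and all names (train_linear, multifold_mil, make_toy_bags, n_folds, n_iters) are hypothetical.

```python
import numpy as np

def train_linear(pos, neg, reg=1.0):
    """Ridge-regression scorer; a stand-in for the detector (assumption)."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = X1.T @ X1 + reg * np.eye(X1.shape[1])
    return np.linalg.solve(A, X1.T @ y)

def score(w, feats):
    """Score candidate windows with weights w (last entry is the bias)."""
    return feats @ w[:-1] + w[-1]

def multifold_mil(pos_bags, neg_feats, n_folds=3, n_iters=5, seed=0):
    """pos_bags: one (n_windows, dim) array per positive image.
    neg_feats: (n_neg, dim) array of windows from negative images."""
    rng = np.random.default_rng(seed)
    n = len(pos_bags)
    # Hypothetical initialization: take the first candidate window per image.
    selected = np.zeros(n, dtype=int)
    for _ in range(n_iters):
        folds = np.array_split(rng.permutation(n), n_folds)
        new_selected = selected.copy()
        for held_out in folds:
            held = set(held_out.tolist())
            train_idx = [i for i in range(n) if i not in held]
            # Train on the currently selected windows of the other folds only.
            pos = np.stack([pos_bags[i][selected[i]] for i in train_idx])
            w = train_linear(pos, neg_feats)
            # Re-localize in the held-out images, which this detector
            # has never seen as positives.
            for i in held_out:
                new_selected[i] = int(np.argmax(score(w, pos_bags[i])))
        selected = new_selected
    # Final detector trained on all re-localized windows.
    final_pos = np.stack([bag[s] for bag, s in zip(pos_bags, selected)])
    return train_linear(final_pos, neg_feats), selected

def make_toy_bags(n_bags=30, n_windows=4, dim=5, seed=0):
    """Synthetic data: one window per bag carries a strong object signal."""
    rng = np.random.default_rng(seed)
    true_idx = np.arange(n_bags) % n_windows
    bags = []
    for t in true_idx:
        bag = 0.1 * rng.standard_normal((n_windows, dim))
        bag[t] += 5.0  # the "object" window
        bags.append(bag)
    neg = 0.1 * rng.standard_normal((60, dim))
    return bags, neg, true_idx
```

Calling multifold_mil(*make_toy_bags()[:2]) returns the final weight vector and the selected window index for each positive image. In standard MIL the detector re-scores the same windows it was just trained on, which can make it keep its initial, possibly wrong, choices; holding each fold out during training is one way to avoid that self-confirmation.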

Suggestions

Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning
Cinbiş, Ramazan Gökberk; Schmid, Cordelia (2017-01-01)
Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the obj...
Object Detection with Minimal Supervision
Demirel, Berkan; Cinbiş, Ramazan Gökberk; İkizler Cinbiş, Nazlı; Department of Computer Engineering (2023-01-18)
Object detection is considered one of the most challenging problems in computer vision since it requires correctly predicting both the object classes and their locations. In the literature, object detection approaches are usually trained in a fully-supervised manner, with a large amount of annotated data for all classes. Since data annotation is costly in terms of both time and labor, there are also alternative object detection methods, such as weakly supervised or mixed supervised learning to reduce these ...
An Analysis on the Effect of Dynamic Range on Object Detection with Deep Neural Networks
Koçdemir, İsmail Hakkı; Kalkan, Sinan; Alatan, Abdullah Aydın; Department of Computer Engineering (2021-10-08)
An important problem in computer vision, particularly in object detection, is being able to perceive objects even under challenging illumination conditions. Being robust to such conditions is especially important in applications, such as autonomous driving. Despite the significance of the problem, existing autonomous driving systems use deep object detection networks with low-dynamic range (LDR) images during both the training phase and the testing phase. In this thesis, we investigate whether high-dynamic ...
Utilization of dense depth information for monoview object detection and instance segmentation
Çakırgöz, Çağlayan Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2022-05-10)
Object detection aims to detect objects of certain classes in an image by bounding them in rectangular boxes, whereas instance segmentation tries to detect objects at the pixel level. Deep learning techniques, which have shown great improvements over the last decade, are utilized in these topics as well, and significant success has been achieved over traditional methods. Similar improvements can be observed in dense depth estimation, which deals with deducing dense information of a scene from a single imag...
Scale invariant representation of 2.5D data
Akagunduz, Erdem; Ulusoy Parnas, İlkay; Bozkurt, Nesli; Halıcı, Uğur (2007-06-13)
In this paper, a scale and orientation invariant feature representation for 2.5D objects is introduced, which may be used to classify, detect and recognize objects even in the presence of clutter and/or occlusion. With this representation, a 2.5D object is defined by an attributed graph structure in which the nodes are the pit and peak regions on the surface. The attributes of the graph are the scales, positions and normals of these pits and peaks. In order to detect these regions a "peakness" (or pi...
Citation Formats
R. G. Cinbiş and C. Schmid, “Multi-fold MIL Training for Weakly Supervised Object Localization,” presented at the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, 2014, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/57653.