Multi-fold MIL Training for Weakly Supervised Object Localization

Date

2014-01-01

Author

Cinbiş, Ramazan Gökberk
Schmid, Cordelia

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when high-dimensional representations, such as the Fisher vectors, are used. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset. Compared to state-of-the-art weakly supervised detectors, our approach better localizes objects in the training images, which translates into improved detection performance.
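The iterative procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the multi-fold step works like cross-validation: positive images are split into folds, and the windows in each fold are re-selected by a detector trained only on the other folds, so that an image's current window cannot reinforce its own selection. A ridge-regression scorer stands in for the detector, plain NumPy arrays stand in for window descriptors such as Fisher vectors, and all function and parameter names (`train_linear_scorer`, `multifold_mil`, `n_folds`, `n_iters`) are hypothetical.

```python
import numpy as np

def train_linear_scorer(pos, neg, reg=1.0):
    """Ridge-regression stand-in for the detector (an assumption, not an SVM).
    Returns a weight vector whose last entry is the bias."""
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    Xb = np.hstack([X, np.ones((len(X), 1))])
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def score(w, windows):
    """Score every candidate window (one row per window)."""
    return windows @ w[:-1] + w[-1]

def multifold_mil(pos_bags, neg_windows, n_folds=3, n_iters=5, seed=0):
    """pos_bags: list of (n_windows, dim) arrays, one per positive image.
    neg_windows: (n_neg, dim) array of windows from negative images.
    Returns the final scorer and the selected window index per positive image."""
    rng = np.random.default_rng(seed)
    n = len(pos_bags)
    # Arbitrary initialisation: start from the first candidate window.
    selected = np.zeros(n, dtype=int)
    # Balanced random assignment of positive images to folds.
    fold_of = rng.permutation(n) % n_folds
    for _ in range(n_iters):
        new_selected = selected.copy()
        for k in range(n_folds):
            # Train on the currently selected windows of all other folds.
            train_idx = np.flatnonzero(fold_of != k)
            pos = np.stack([pos_bags[i][selected[i]] for i in train_idx])
            w = train_linear_scorer(pos, neg_windows)
            # Re-localise only the held-out fold with that detector.
            for i in np.flatnonzero(fold_of == k):
                new_selected[i] = int(np.argmax(score(w, pos_bags[i])))
        selected = new_selected
    # Final detector trained on all selected windows.
    final_pos = np.stack([pos_bags[i][selected[i]] for i in range(n)])
    return train_linear_scorer(final_pos, neg_windows), selected
```

Standard MIL corresponds to retraining on all positive images and re-localising those same images. In this sketch, the only change is that re-localisation for an image uses a detector that never saw that image's current window during training.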

Subject Keywords

Training, Detectors, Standards, Object detection, Feature extraction, Vectors, Support vector machines

URI

https://hdl.handle.net/11511/57653

DOI

https://doi.org/10.1109/cvpr.2014.309

Conference Name

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Collections

Department of Computer Engineering, Conference / Seminar

Suggestions


Weakly Supervised Object Localization with Multi-Fold Multiple Instance Learning Cinbiş, Ramazan Gökberk; Schmid, Cordelia (2017-01-01) Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the obj...
Object Detection with Minimal Supervision Demirel, Berkan; Cinbiş, Ramazan Gökberk; İkizler Cinbiş, Nazlı; Department of Computer Engineering (2023-1-18) Object detection is considered one of the most challenging problems in computer vision since it requires correctly predicting both the object classes and their locations. In the literature, object detection approaches are usually trained in a fully-supervised manner, with a large amount of annotated data for all classes. Since data annotation is costly in terms of both time and labor, there are also alternative object detection methods, such as weakly supervised or mixed supervised learning to reduce these ...
AN ANALYSIS ON THE EFFECT OF DYNAMIC RANGE ON OBJECT DETECTION WITH DEEP NEURAL NETWORKS Koçdemir, İsmail Hakkı; Kalkan, Sinan; Alatan, Abdullah Aydın; Department of Computer Engineering (2021-10-8) An important problem in computer vision, particularly in object detection, is being able to perceive objects even under challenging illumination conditions. Being robust to such conditions is especially important in applications, such as autonomous driving. Despite the significance of the problem, existing autonomous driving systems use deep object detection networks with low-dynamic range (LDR) images during both the training phase and the testing phase. In this thesis, we investigate whether high-dynamic ...
Utilization of dense depth information for monoview object detection and instance segmentation Çakırgöz, Çağlayan Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2022-5-10) Object detection aims for detecting objects of certain classes in an image by bounding them in rectangular boxes whereas instance segmentation tries to detect objects in pixel level. Deep learning techniques, which have shown great improvements over the last decade, are utilized in these topics as well, and a significant success is achieved against the traditional methods. Similar improvements can be observed in dense depth estimation which deals with deducing dense information of a scene from a single imag...
Scale invariant representation of 2.5D data AKAGUNDUZ, Erdem; ULUSOY PARNAS, İLKAY; BOZKURT, Nesli; Halıcı, Uğur (2007-06-13) In this paper, a scale and orientation invariant feature representation for 2.5D objects is introduced, which may be used to classify, detect and recognize objects even under the cases of cluttering and/or occlusion. With this representation a 2.5D object is defined by an attributed graph structure, in which the nodes are the pit and peak regions on the surface. The attributes of the graph are the scales, positions and the normals of these pits and peaks. In order to detect these regions a "peakness" (or pi...

Citation Formats

R. G. Cinbiş and C. Schmid, “Multi-fold MIL Training for Weakly Supervised Object Localization,” presented at the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, 2014. [Online]. Available: https://hdl.handle.net/11511/57653.