Rescoring detections based on contextual scores in object detection

2019
Zorlu, Ersan Vural
To detect objects in an image, current state-of-the-art object detectors first define candidate object locations and then classify each of them into one of the predefined categories or as background. They do so using visual features extracted locally from the candidate locations, omitting the rich contextual information embedded in the whole image. Contextual information can be utilized to complement the information extracted locally and thereby improve object detection accuracy. Researchers have proposed many models that exploit scene-level and/or instance-level context by using non-local features from the same image. In this work, we propose models to improve object detection by utilizing contextual information embedded in the confidence scores of detections in the whole image, without using any visual features. Our models use object-to-object spatial and scale-related relationships and work as a post-processing step that can be plugged into any object detector. Specifically, for a reference detection output by the base object detector, our model first defines a variety of spatial and scale-based regions relative to the location of the reference detection. Then, each of these regions is summarized by the confidence scores of the detections inside it. Next, the confidence scores of the reference detection and the contextual confidence scores are processed by our models. We propose three variants based on multilayer perceptrons. We evaluate our models in conjunction with the state-of-the-art RetinaNet object detector on the widely used MSCOCO benchmark dataset, where we show that our models improve average precision by up to 1.8 percentage points.
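
The rescoring pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not specify the region definitions, the summary statistic, or the network layout, so everything below is an assumption made for illustration. The six regions (`left`, `right`, `above`, `below`, `smaller`, `larger`), the 0.5x and 2x area thresholds, the max-score summary, and the names `Detection`, `context_features`, and `mlp_rescore` are all hypothetical. In a real system the perceptron weights would be learned from training data.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    """Axis-aligned box with the base detector's confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)

    @property
    def area(self):
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


# Hypothetical region set: four spatial regions and two scale-based ones.
REGIONS = ("left", "right", "above", "below", "smaller", "larger")


def context_features(ref: Detection, detections: Sequence[Detection]) -> List[float]:
    """Return [ref.score] followed by one max-score summary per region."""
    summary = {name: 0.0 for name in REGIONS}
    for det in detections:
        if det is ref:
            continue
        cx, cy = det.center
        tags = []
        # Spatial regions, decided by where the other box's centre falls
        # (image coordinates: y grows downward).
        if cx < ref.x1:
            tags.append("left")
        elif cx > ref.x2:
            tags.append("right")
        if cy < ref.y1:
            tags.append("above")
        elif cy > ref.y2:
            tags.append("below")
        # Scale regions, decided by area relative to the reference box.
        if det.area < 0.5 * ref.area:
            tags.append("smaller")
        elif det.area > 2.0 * ref.area:
            tags.append("larger")
        for tag in tags:
            summary[tag] = max(summary[tag], det.score)
    return [ref.score] + [summary[name] for name in REGIONS]


def mlp_rescore(features, w_hidden, b_hidden, w_out, b_out) -> float:
    """One-hidden-layer perceptron (ReLU, then sigmoid) giving a new score."""
    hidden = [
        max(0.0, sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Toy usage: three detections, the first one is the reference.
dets = [
    Detection(100, 100, 200, 200, 0.6),
    Detection(10, 120, 60, 180, 0.9),
    Detection(120, 300, 180, 360, 0.4),
]
features = context_features(dets[0], dets)
```

Because the feature vector contains only confidence scores and box geometry, a step of this kind needs no access to image pixels, which is what lets it run after any base detector.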

Suggestions

Object Detection with Convolutional Context Features
Kaya, Emre Can; Alatan, Abdullah Aydın (2017-01-01)
A novel extension to Huh B-ESA object detection algorithm is proposed in order to learn convolutional context features for determining boundaries of objects better. For input images, the hypothesis windows and their context around those windows are learned through convolutional layers as two parallel networks. The resulting object and context feature maps are combined in such a way that they preserve their spatial relationship. The proposed algorithm is trained and evaluated on PASCAL VOC 2007 detection ben...
Scale invariant representation of 2.5D data
Akagunduz, Erdem; Ulusoy Parnas, İlkay; Bozkurt, Nesli; Halıcı, Uğur (2007-06-13)
In this paper, a scale and orientation invariant feature representation for 2.5D objects is introduced, which may be used to classify, detect and recognize objects even under the cases of cluttering and/or occlusion. With this representation a 2.5D object is defined by an attributed graph structure, in which the nodes are the pit and peak regions on the surface. The attributes of the graph are the scales, positions and the normals of these pits and peaks. In order to detect these regions a "peakness" (or pi...
Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery
Sumbul, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (2018-02-01)
Fine-grained object recognition that aims to identify the type of an object among a large number of subcategories is an emerging application with the increasing resolution that exposes new details in image data. Traditional fully supervised algorithms fail to handle this problem where there is low between-class variance and high within-class variance for the classes of interest with small sample sizes. We study an even more extreme scenario named zero-shot learning (ZSL) in which no training example exists f...
Improving Proposal-Based Object Detection Using Convolutional Context Features
Kaya, Emre Can; Alatan, Abdullah Aydın (2018-10-10)
A novel extension to proposal-based detection is proposed in order to learn convolutional context features for determining boundaries of objects better. Objects and their context are aimed to be learned through parallel convolutional stages. The resulting object and context feature maps are combined in such a way that they preserve their spatial relationship. The proposed algorithm is trained and evaluated on PASCAL VOC 2007 detection benchmark dataset and yielded improvements in performance over baseline, ...
Multisource region attention network for fine-grained object recognition in remote sensing imagery
Sümbül, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (Institute of Electrical and Electronics Engineers (IEEE), 2019-07)
Fine-grained object recognition concerns the identification of the type of an object among a large number of closely related subcategories. Multisource data analysis that aims to leverage the complementary spectral, spatial, and structural information embedded in different sources is a promising direction toward solving the fine-grained recognition problem that involves low between-class variance, small training set sizes for rare classes, and class imbalance. However, the common assumption of coregistered ...
Citation Formats
E. V. Zorlu, “Rescoring detections based on contextual scores in object detection,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Computer Engineering, Middle East Technical University, 2019.