Generating positive bounding boxes for balanced training of object detectors

2020-03-01
Oksuz, Kemal
Cam, Baris Can
Akbaş, Emre
Kalkan, Sinan
© 2020 IEEE. Two-stage deep object detectors generate a set of regions-of-interest (RoIs) in the first stage, then, in the second stage, identify objects among the proposed RoIs that sufficiently overlap with a ground truth (GT) box. The second stage is known to suffer from a bias towards RoIs that have low intersection-over-union (IoU) with the associated GT boxes. To address this issue, we first propose a sampling method to generate bounding boxes (BB) that overlap with a given reference box more than a given IoU threshold. Then, we use this BB generation method to develop a positive RoI (pRoI) generator that, for the second stage, produces RoIs following any desired spatial or IoU distribution. We show that our pRoI generator is able to simulate other sampling methods for positive examples such as hard example mining and prime sampling. Using our generator as an analysis tool, we show that (i) IoU imbalance has an adverse effect on performance, (ii) hard positive example mining improves the performance only for certain input IoU distributions, and (iii) the imbalance among the foreground classes has an adverse effect on performance and that it can be alleviated at the batch level. Finally, we train Faster R-CNN using our pRoI generator and, compared to conventional training, obtain better or on-par performance for low IoUs and significant improvements when trained for higher IoUs for Pascal VOC and MS COCO datasets. The code is available at: https://github.com/kemaloksuz/BoundingBoxGenerator.
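
To make the central idea concrete (drawing boxes whose IoU with a reference box exceeds a threshold), the following is a minimal illustrative sketch. It is not the sampling method proposed in the paper and does not come from the linked repository: it uses plain rejection sampling as a stand-in, and the function names `iou` and `sample_box_above_iou`, the `(x1, y1, x2, y2)` box format, and the `max_tries` parameter are all assumptions made for this example.

```python
import random


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_box_above_iou(ref, threshold, rng, max_tries=10000):
    """Draw one box with IoU(ref, box) >= threshold by rejection sampling.

    Assumption: this is a simple stand-in technique, not the paper's
    generator. Each coordinate of `ref` is shifted by a uniform offset,
    and candidates that miss the threshold are discarded.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    w, h = ref[2] - ref[0], ref[3] - ref[1]
    # Per-coordinate shift bound relative to the side length. A coordinate
    # moved further than this cannot reach the threshold, so nothing
    # feasible is excluded.
    s = (1.0 - threshold) / threshold
    for _ in range(max_tries):
        x1 = ref[0] + rng.uniform(-s * w, s * w)
        y1 = ref[1] + rng.uniform(-s * h, s * h)
        x2 = ref[2] + rng.uniform(-s * w, s * w)
        y2 = ref[3] + rng.uniform(-s * h, s * h)
        if x2 <= x1 or y2 <= y1:
            continue  # degenerate box, try again
        box = (x1, y1, x2, y2)
        if iou(ref, box) >= threshold:
            return box
    raise RuntimeError("no box found within max_tries")


if __name__ == "__main__":
    rng = random.Random(0)
    reference = (10.0, 20.0, 110.0, 80.0)
    box = sample_box_above_iou(reference, 0.7, rng)
    print(box, iou(reference, box))
```

Rejection sampling becomes wasteful as the threshold approaches 1, since most candidates are discarded, and it gives no direct control over the resulting IoU distribution; a generator that produces a desired spatial or IoU distribution, as the abstract describes, would need a more targeted construction.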

Suggestions

Improvements on one-stage object detection by visual reasoning
Aksoy, Tolga; Halıcı, Uğur; Department of Electrical and Electronics Engineering (2022-5-09)
Current state-of-the-art one-stage object detectors are limited by treating each image region separately, without considering possible relations between objects. As a result, successful detection depends solely on high-quality convolutional feature representations, which may not be attainable under challenging conditions. In this thesis, a new architecture is proposed for one-stage object detection that reasons about the relations between image regions by using self-attention. T...
Training object detectors by directly optimizing LRP metric
Çam, Barış Can; Akbaş, Emre; Kalkan, Sinan; Department of Computer Engineering (2020-9)
This thesis focuses on training deep object detection networks by directly optimizing the localisation-recall-precision (LRP) performance metric that can evaluate classification and localisation performance of an object detector in a unified manner (Oksuz et al., 2018). To achieve this goal, unlike the commonly used linear weighting approach, we aim to implicitly optimize the LRP metric first by using a bounded localisation loss from previous works and proposing a loss function that can bound the range ...
Infrared Object Classification Using Decision Tree Based Deep Neural Networks
Gundogdu, Erhan; Koç, Aykut; Alatan, Abdullah Aydın (2016-05-19)
In this work, we focus on the problem of infrared (IR) object classification by dividing the object appearance space hierarchically with a binary decision tree structure. Specially designed features of the object appearances make the binary decisions at each node of the tree. These features are extracted using a fully connected deep neural network. At each node of the tree, we train individual deep CNNs such that each node specializes in its corresponding subspace. The proposed method is tested in our gener...
Modeling Voxel Connectivity for Brain Decoding
Onal, Itir; Ozay, Mete; Yarman Vural, Fatoş Tunay (2015-06-12)
The massively dynamic nature of the human brain cannot be represented by considering only a collection of voxel intensity values obtained from fMRI measurements. It has been observed that the degree of connectivity among voxels provides important information for modeling cognitive activities. Moreover, spatially close voxels act together to generate similar BOLD responses to the same stimuli. In this study, we propose a local mesh model, called Local Mesh Model with Temporal Measurements (LMM-TM), to first estim...
Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV
Sirunyan, A.M.; et al. (IOP Publishing, 2018-05-01)
Many measurements and searches for physics beyond the standard model at the LHC rely on the efficient identification of heavy-flavour jets, i.e. jets originating from bottom or charm quarks. In this paper, the discriminating variables and the algorithms used for heavy-flavour jet identification during the first years of operation of the CMS experiment in proton-proton collisions at a centre-of-mass energy of 13 TeV, are presented. Heavy-flavour jet identification algorithms have been improved compared to th...
Citation Formats
K. Oksuz, B. C. Cam, E. Akbaş, and S. Kalkan, “Generating positive bounding boxes for balanced training of object detectors,” 2020, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/54409.