Identifying and addressing imbalance problems in visual detection

Date

2021-5

Author

Öksüz, Kemal

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This thesis has two aims: (Aim 1) identifying imbalance problems in visual detection, and (Aim 2) addressing these problems using loss functions based on performance measures. For Aim 1, we present a comprehensive review of the imbalance problems in object detection, including a problem-based taxonomy and a detailed discussion of each problem with its solutions and open issues. To achieve Aim 2, we identify two challenges: (i) Average Precision (AP), the common performance measure, has certain drawbacks. To remedy them, we propose Localisation Recall Precision (LRP) Error as a novel performance measure. (ii) Loss functions derived from performance measures are ranking-based functions whose derivatives are zero or infinite, so they cannot be used directly with backpropagation. To overcome this, based on perceptron learning, we propose Identity Update, a simple and general optimisation method for ranking-based losses, which provably ensures balance in terms of the total gradient magnitudes of positives and negatives. Having addressed these challenges, using LRP Error and Identity Update, we propose average LRP Loss and Rank & Sort (RS) Loss for balanced training of visual detectors. We show that our loss functions have the following unique benefits: (i) they are easy to tune with a single hyper-parameter, unlike common methods with ~7 hyper-parameters on average, (ii) they enforce correlation among the sub-tasks of visual detectors (i.e. classification and different localisation tasks), which affects both the detections remaining after Non-Maximum Suppression and the performance measure AP, and (iii) they are applicable to a diverse set of visual detectors (i.e. one-stage, multi-stage, anchor-based, anchor-free, with balanced or severely imbalanced data). As a result of these benefits, for example with RS Loss, we train four object detection and three instance segmentation methods only by tuning the learning rate and consistently improve their performance.
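
The claim that ranking-based measures have zero or infinite derivatives can be illustrated with a small sketch. It uses a simplified, non-interpolated AP on a toy ranking, which is an assumption made here for illustration and not the exact AP protocol or any loss from the thesis. The function name `average_precision` and the toy scores are hypothetical.

```python
def average_precision(scores, labels):
    """Simplified AP: mean of the precision values at the rank of each positive.

    This is an illustrative, non-interpolated variant, not the exact
    evaluation protocol used for visual detectors.
    """
    # Rank examples by descending confidence score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy data (hypothetical): 1 marks a positive, 0 a negative.
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1, 0, 1, 0]

base = average_precision(scores, labels)

# A small score change that keeps the ordering leaves AP unchanged,
# so the derivative with respect to that score is zero.
nudged = average_precision([0.9, 0.8, 0.6 + 1e-3, 0.3], labels)

# A change that swaps the ranks of a positive and a negative makes AP jump,
# which is where the derivative is undefined (infinite).
swapped = average_precision([0.9, 0.8, 0.81, 0.3], labels)

print(base, nudged, swapped)
```

In this toy case AP depends only on the order of the scores, not on their values, which is why a gradient-based optimiser receives no useful signal from it without a method such as the Identity Update described above.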

Subject Keywords

Visual detection, Segmentation, Object detection, Performance measure, Average precision, Loss function, Optimisation method, Ranking, Sorting

URI

https://hdl.handle.net/11511/90919

Collections

Graduate School of Natural and Applied Sciences, Thesis

Citation Formats

K. Öksüz, “Identifying and addressing imbalance problems in visual detection,” Ph.D. - Doctoral Program, Middle East Technical University, 2021.