Quantifying and mitigating class imbalance in long-tailed visual recognition

2022-7
Baltacı, Zeynep Sonat
Objects are distributed unevenly in the real world, which manifests as a long-tailed distribution in realistic visual recognition datasets. Deep learning-based approaches trained on such imbalanced datasets with conventional gradient-based training strategies exhibit unfair recognition performance towards classes that are under-represented in the dataset. This so-called class imbalance has been studied in the literature by measuring imbalance via either class frequency or class hardness, and by using those measures to mitigate imbalance through sampling, loss weighting or calibration strategies. In this thesis, we argue and empirically show that sample frequency or hardness alone is not sufficient for capturing imbalance among classes. We then propose a novel measure based on the predictive uncertainty of a trained deep network and demonstrate that it can capture imbalance better than existing approaches. Finally, we incorporate our measure into existing imbalance mitigation methods: loss reweighting, resampling, margin-based methods, and two-stage training. We show that predictive uncertainty-based methods improve over or perform on par with existing baselines on the long-tailed datasets CIFAR-10-LT, CIFAR-100-LT and ImageNet-LT.
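To make the contrast between frequency-based and uncertainty-based class weighting concrete, the following is a minimal sketch and not the thesis's actual measure. It assumes that per-class predictive uncertainty can be approximated by the mean Shannon entropy of a trained network's softmax outputs over each class's samples. The function names, the normalization, and the toy predictions are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def class_uncertainty(pred_probs, labels, num_classes):
    """Mean predictive entropy over the samples of each class.

    Assumption: entropy stands in for the thesis's uncertainty measure.
    """
    totals = [0.0] * num_classes
    counts = [0] * num_classes
    for probs, label in zip(pred_probs, labels):
        totals[label] += entropy(probs)
        counts[label] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]

def normalized_weights(scores, eps=1e-8):
    """Rescale non-negative scores so the weights average to 1."""
    total = sum(scores) + eps * len(scores)
    return [len(scores) * (s + eps) / total for s in scores]

def inverse_frequency_weights(labels, num_classes):
    """Frequency-based baseline: weight each class by 1 / sample count."""
    counts = [0] * num_classes
    for label in labels:
        counts[label] += 1
    return normalized_weights([1.0 / c if c else 0.0 for c in counts])

# Hypothetical toy data: class 0 is frequent and predicted confidently,
# class 1 is rare and predicted with maximal uncertainty.
pred_probs = [[0.9, 0.1], [0.95, 0.05], [0.9, 0.1], [0.85, 0.15], [0.5, 0.5]]
labels = [0, 0, 0, 0, 1]

uncertainty = class_uncertainty(pred_probs, labels, num_classes=2)
uncertainty_weights = normalized_weights(uncertainty)
frequency_weights = inverse_frequency_weights(labels, num_classes=2)
```

Either weight vector could then be passed as per-class weights to a weighted cross-entropy loss. The two can differ whenever a frequent class remains hard to predict or a rare class is easy, which is the situation the abstract argues frequency alone does not capture.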

Suggestions

Continuous dimensionality characterization of image structures
Felsberg, Michael; Kalkan, Sinan; Kruger, Norbert (Elsevier BV, 2009-05-04)
Intrinsic dimensionality is a concept introduced by statistics and later used in image processing to measure the dimensionality of a data set. In this paper, we introduce a continuous representation of the intrinsic dimension of an image patch in terms of its local spectrum or, equivalently, its gradient field. By making use of a cone structure and barycentric co-ordinates, we can associate three confidences to the three different ideal cases of intrinsic dimensions corresponding to homogeneous image patche...
Investigation of effect of design and operating parameters on acoustophoretic particle separation via 3D device-level simulations
Sahin, Mehmet Akif; ÇETİN, BARBAROS; Özer, Mehmet Bülent (Springer Science and Business Media LLC, 2019-12-16)
In the present study, a 3D device-level numerical model is implemented via finite element method to assess the effects of design and operating parameters on the separation performance of a microscale acoustofluidic device. Elastodynamic equations together with electromechanical coupling at the piezoelectric actuators for the stress field within the solid parts, Helmholtz equation for the acoustic field within fluid, and Navier-Stokes equations for the fluid flow are coupled for the simulations. Once the zer...
Analysis of Face Recognition Algorithms for Online and Automatic Annotation of Personal Videos
Yılmaztürk, Mehmet; Ulusoy Parnas, İlkay; Çiçekli, Fehime Nihan (Springer, Dordrecht; 2010-05-08)
Different from previous automatic but offline annotation systems, this paper studies automatic and online face annotation for personal videos/episodes of TV series considering Nearest Neighbourhood, LDA and SVM classification with Local Binary Patterns, Discrete Cosine Transform and Histogram of Oriented Gradients feature extraction methods in terms of their recognition accuracies and execution times. The best performing feature extraction method and the classifier pair is found out to be SVM classification...
Data-driven image captioning via salient region discovery
Kilickaya, Mert; Akkuş, Burak Kerim; Çakıcı, Ruket; Erdem, Aykut; Erdem, Erkut; İKİZLER CİNBİŞ, NAZLI (Institution of Engineering and Technology (IET), 2017-09-01)
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have been proven to be highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, the authors propose to integrate an object-based semantic image r...
Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery
Sumbul, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (2018-02-01)
Fine-grained object recognition that aims to identify the type of an object among a large number of subcategories is an emerging application with the increasing resolution that exposes new details in image data. Traditional fully supervised algorithms fail to handle this problem where there is low between-class variance and high within-class variance for the classes of interest with small sample sizes. We study an even more extreme scenario named zero-shot learning (ZSL) in which no training example exists f...
Citation Formats
Z. S. Baltacı, “Quantifying and mitigating class imbalance in long-tailed visual recognition,” M.S. - Master of Science, Middle East Technical University, 2022.