Towards Uncertainty-Aware Disentangled Representations

Özyeğin, Sezai Artun
In many computer vision tasks, parts of an object of interest are not always visible because of challenges such as occlusion and viewpoint and pose variation. One approach to these challenges is to separate the representation into parts that correspond to different regions of the object. In this thesis, we tackle the problem of obtaining disentangled representations while estimating the uncertainty of each factor to assess its availability. Representations are disentangled using a factor-related supervised task, and unrelated information is removed using an adversarial loss. The uncertainty of each factor is estimated using loss attenuation over the same factor-related task. We try several methods of integrating the uncertainty values into both the training procedure and the test-time decision-making process to make the model more robust to unavailable parts. The experiments are conducted on a toy dataset and on the person re-identification task (namely, the Market-1501 dataset), which can benefit from disentangled representations.
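The abstract does not give the exact loss used in the thesis. As a minimal sketch of what loss attenuation commonly looks like, the following assumes the standard heteroscedastic regression form, in which a model predicts a log-variance s alongside its output and the loss is 0.5 * exp(-s) * error^2 + 0.5 * s. The function names, the squared-error task, and the weighting rule in `uncertainty_weighted_distance` are all hypothetical illustrations, not the author's method.

```python
import math


def attenuated_squared_error(prediction, target, log_variance):
    """Squared error attenuated by a predicted log-variance s = log(sigma^2).

    Assumed form: 0.5 * exp(-s) * (target - prediction)^2 + 0.5 * s.
    A large s down-weights the error term but is penalised by the 0.5 * s term.
    """
    squared_error = (target - prediction) ** 2
    return 0.5 * math.exp(-log_variance) * squared_error + 0.5 * log_variance


def uncertainty_weighted_distance(distances, log_variances):
    """Combine per-factor distances, trusting low-uncertainty factors more.

    Hypothetical test-time rule: each factor gets weight exp(-s), and the
    weights are normalised to sum to 1 before averaging the distances.
    """
    weights = [math.exp(-s) for s in log_variances]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, distances)) / total


if __name__ == "__main__":
    # A confident prediction (s = 0) pays the full squared error.
    print(attenuated_squared_error(0.0, 2.0, 0.0))
    # Declaring higher uncertainty (s = log 4) reduces the loss on the same error.
    print(attenuated_squared_error(0.0, 2.0, math.log(4.0)))
    # A factor with high uncertainty contributes little to the combined distance.
    print(uncertainty_weighted_distance([1.0, 3.0], [0.0, 10.0]))
```

Under these assumptions, a factor whose region is occluded can receive a high predicted variance, which lowers its influence both on the training loss and on the combined distance used for matching.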


Shape descriptors based on intersection consistency and global binary patterns
Sivri, Erdal; Kalkan, Sinan; Department of Computer Engineering (2012)
Shape description is an important problem in computer vision because most vision tasks that require comparing or matching visual entities rely on shape descriptors. In this thesis, two novel shape descriptors are proposed, namely Intersection Consistency Histogram (ICH) and Global Binary Patterns (GBP). The former is based on a local regularity measure called Intersection Consistency (IC), which determines whether edge pixels in an image patch point towards the center or not. The second method, called Globa...
Perceptual quality preserving adversarial attacks
Aksoy, Bilgin; Temizel, Alptekin; Department of Modeling and Simulation (2019)
Deep learning is used in various successful computer vision applications such as image classification. Deep neural networks (DNNs), especially convolutional neural networks, have reached above-human-level accuracy rates for image classification tasks. While DNNs have solved the image classification task and enabled its use in many practical applications, recent research has unveiled some properties which could degrade their performance. Adversarial images are samples that are intentionally modified by adding no...
Efficient detection and tracking of salient regions for visual processing on mobile platforms
Serhat, Gülhan; Saranlı, Afşar; Department of Electrical and Electronics Engineering (2009)
Visual Attention is an interesting concept that constantly widens its application areas in the field of image processing and computer vision. The main idea of visual attention is to find the locations on the image that are visually attractive. In this thesis, the visually attractive regions are extracted and tracked in video sequences coming from the vision systems of mobile platforms. First, the salient regions are extracted in each frame and a feature vector is constructed for each one. Then Scale Invaria...
Effect of Visual Context Information for Super Resolution Problems
Akar, Gözde; Aykut, Ekin; Cengiz, Baran; Bocek, Kadircan (2019-04-26)
In this study, the effect of visual context information on the performance of learning-based techniques for the super resolution problem is analyzed. Besides a detailed interpretation of the experimental results, the paper also provides theoretical reasoning for them. For the experiments, two different visual datasets composed of natural and remote sensing scenes are utilized. From the experimental results, we observe that keeping visual context information in the course of parameter learning for convolu...
Articulated motion analysis via axis-based representation
Erdem, Sezen; Tarı, Zehra Sibel (2007-01-01)
Human motion analysis is one of the active research areas in computer vision. The trend is shifting from computing motion fields to determining actions. We present an action coding scheme based on a trajectory of features defined with respect to a part-based coordinate system. The method does not require a prior human model or special motion capture hardware. The features are extracted from images segmented in the form of silhouettes. The feature extraction step ignores 3D effects such as self occlusions or motion...
Citation Formats
S. A. Özyeğin, “Towards Uncertainty-Aware Disentangled Representations,” M.S. - Master of Science, Middle East Technical University, 2021.