Visual semantic segmentation with diminished supervision

2023-08-25
Ergül, Mustafa
Despite the promising performance of conventional fully supervised algorithms, semantic segmentation remains an important yet challenging task. Recent approaches have attempted to exploit the capabilities of deep learning techniques for image recognition to tackle structured prediction tasks such as semantic segmentation. However, the very invariance properties that make deep Convolutional Neural Networks (CNNs) effective for high-level tasks such as classification also limit their capacity for precise visual delineation. To address this problem, this thesis investigates a new form of deep neural network that integrates the strengths of CNNs and probabilistic graphical modeling based on Conditional Random Fields (CRFs). In addition, most state-of-the-art deep learning methods rely on a large number of annotated training samples. Such samples are scarce for this task because pixel-level (or superpixel-level) annotation is time-consuming and labor-intensive, taking 15-60 minutes for a single image. Given the limited availability of complete annotations, it is of great interest to design semantic segmentation solutions that exploit weakly labeled data, which is readily available at a much larger scale. In this dissertation, we propose a unified approach that incorporates various forms of weak supervision (image-level tags, bounding boxes, or partial labels) alongside pixel-level labels to produce pixel-wise labeling.
Citation Formats
M. Ergül, “Visual semantic segmentation with diminished supervision,” Ph.D. - Doctoral Program, Middle East Technical University, 2023.