Semi-supervised generative guidance for zero-shot semantic segmentation

2022-1
Önem, Abdullah Cem
Collecting fully annotated data to train deep networks for semantic image segmentation can be prohibitively costly because of the difficulty of making pixel-by-pixel annotations. In this context, formulations based on zero-shot learning relax the labelled-data requirements by enabling the recognition of classes without training examples. Recent studies on zero-shot learning of semantic segmentation models, however, highlight the difficulty of the problem. This thesis proposes techniques for improving zero-shot generalization to unseen classes by exploiting unlabelled images. The main goal is to train a generative image model conditioned on zero-shot segmentation predictions in a semi-supervised manner, and to use the feedback that the generative model passes back to its segmentation-based conditioning inputs as guidance. In this manner, the zero-shot segmentation model is encouraged to make more accurate predictions so that it provides more informative conditional inputs to the generative model. To further improve its training dynamics, the generative model is trained in the feature space provided by the early convolutional layer(s) of the segmentation architecture, which overall forms a high-level to low-level generative feedback loop. Following the state of the art, the approach is experimentally evaluated on the COCO-Stuff dataset.
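
The feedback idea in the abstract can be illustrated with a deliberately tiny sketch. It is not the thesis implementation: the generative model is replaced by a fixed linear decoder with a squared-error loss, a single pixel stands in for an image, and the gradient updates the class logits directly rather than the weights of a segmentation network. All names (`decoder`, `guidance_loss`, `guidance_grad`, `refine_logits`) are hypothetical. The sketch only shows how a loss computed on the generator's output can be differentiated with respect to the segmentation-based conditioning input, so that better class predictions yield a better reconstruction of low-level features.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(decoder, probs):
    """Toy 'generator' (assumption): linear map from class probabilities
    to a low-level feature vector. decoder is a D x C list of rows."""
    return [sum(w * p for w, p in zip(row, probs)) for row in decoder]


def guidance_loss(decoder, logits, target_feat):
    """Squared error between generated and observed low-level features."""
    recon = generate(decoder, softmax(logits))
    return sum((r - t) ** 2 for r, t in zip(recon, target_feat))


def guidance_grad(decoder, logits, target_feat):
    """Gradient of guidance_loss with respect to the logits, i.e. the
    feedback signal that reaches the conditioning input."""
    probs = softmax(logits)
    recon = generate(decoder, probs)
    residual = [2.0 * (r - t) for r, t in zip(recon, target_feat)]
    # Back through the linear decoder: dL/dp_c = sum_d residual_d * W[d][c]
    grad_p = [
        sum(residual[d] * decoder[d][c] for d in range(len(decoder)))
        for c in range(len(probs))
    ]
    # Back through softmax: dL/dz_c = p_c * (dL/dp_c - sum_k p_k * dL/dp_k)
    dot = sum(p * g for p, g in zip(probs, grad_p))
    return [p * (g - dot) for p, g in zip(probs, grad_p)]


def refine_logits(decoder, logits, target_feat, steps=100, lr=0.5):
    """Plain gradient descent on the logits using the generator feedback."""
    z = list(logits)
    for _ in range(steps):
        grad = guidance_grad(decoder, z, target_feat)
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


if __name__ == "__main__":
    # Columns are per-class feature prototypes (made-up numbers).
    decoder = [[1.0, 0.0, -1.0],
               [0.0, 1.0, -1.0]]
    target = [0.0, 1.0]  # feature observed at an unlabelled pixel
    refined = refine_logits(decoder, [0.0, 0.0, 0.0], target)
    print(softmax(refined))
```

In this toy setting the observed feature matches the prototype of class 1, so the feedback pushes probability mass toward that class even though no label was provided. The actual approach described above trains the generative model jointly and in a semi-supervised manner, which this sketch does not attempt to reproduce.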

Citation Formats
A. C. Önem, “Semi-supervised generative guidance for zero-shot semantic segmentation,” M.S. - Master of Science, Middle East Technical University, 2022.