Object localization is the task of finding the bounding boxes of objects in a scene. Most state-of-the-art approaches rely on meticulously handcrafted training datasets. In this work, we aim to create a generative adversarial reinforcement learning framework that works without any explicit bounding box information. Instead of bounding boxes, our framework uses tightly cropped object images as training data. The object localization framework consists of two parts: a reinforcement learning agent (RL agent) and a discriminator. The RL agent takes input scenes and crops them with the objective of producing a tightly cropped object image. The discriminator tries to distinguish whether an image was generated by the RL agent or comes from a database of tightly cropped object images. Experiments indicate that promising localization performance can be achieved without explicit bounding box data. It can be concluded that generative adversarial reinforcement learning is an important tool for other learning problems where explicit input/output paired data is not available.
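
The interaction between the cropping agent and the discriminator can be illustrated with a minimal sketch. It is not the thesis implementation: the action set, the function names, the toy scene, and the scoring function are all assumptions made for illustration. In particular, a greedy search stands in for the learned RL policy, and a hand-written "object purity" score stands in for the trained discriminator. The only idea carried over from the abstract is that the agent crops the scene step by step and is rewarded when the crop looks more like a tightly cropped object image.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Image = List[List[int]]          # toy binary image: 1 = object pixel, 0 = background

# Hypothetical action set: each action shrinks the crop by one pixel from one side.
ACTIONS = ("shrink_left", "shrink_top", "shrink_right", "shrink_bottom")


def apply_action(box: Box, action: str) -> Box:
    """Return the box after one shrink action, never going below 1x1."""
    left, top, right, bottom = box
    if action == "shrink_left" and right - left > 1:
        left += 1
    elif action == "shrink_top" and bottom - top > 1:
        top += 1
    elif action == "shrink_right" and right - left > 1:
        right -= 1
    elif action == "shrink_bottom" and bottom - top > 1:
        bottom -= 1
    return (left, top, right, bottom)


def crop(scene: Image, box: Box) -> Image:
    """Cut the region described by box out of the scene."""
    left, top, right, bottom = box
    return [row[left:right] for row in scene[top:bottom]]


def toy_discriminator(patch: Image) -> float:
    """Stand-in for a trained discriminator (assumption): fraction of object pixels."""
    total = sum(len(row) for row in patch)
    return sum(sum(row) for row in patch) / total if total else 0.0


def greedy_localize(scene: Image, score: Callable[[Image], float],
                    max_steps: int = 50) -> Box:
    """Greedy stand-in for the RL agent: take the action with the largest
    increase in discriminator score, and stop when no action improves it."""
    box: Box = (0, 0, len(scene[0]), len(scene))
    for _ in range(max_steps):
        current = score(crop(scene, box))
        best_gain, best_box = 0.0, box
        for action in ACTIONS:
            candidate = apply_action(box, action)
            if candidate == box:
                continue  # action was a no-op at minimum size
            gain = score(crop(scene, candidate)) - current  # reward signal
            if gain > best_gain:
                best_gain, best_box = gain, candidate
        if best_box == box:
            break
        box = best_box
    return box


# Toy 6x6 scene with a 3x2 object occupying columns 1-3 and rows 2-3.
SCENE: Image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(greedy_localize(SCENE, toy_discriminator))
```

In the adversarial setting described above, the scoring function would itself be trained alongside the agent, and the policy would be learned from the reward rather than chosen greedily. The purity score used here also cannot penalize crops that cut into the object, which a discriminator trained on full object images would be expected to do.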


Object Detection with Convolutional Context Features
Kaya, Emre Can; Alatan, Abdullah Aydın (2017-01-01)
A novel extension to Huh B-ESA object detection algorithm is proposed in order to learn convolutional context features for determining boundaries of objects better. For input images, the hypothesis windows and their context around those windows are learned through convolutional layers as two parallel networks. The resulting object and context feature maps are combined in such a way that they preserve their spatial relationship. The proposed algorithm is trained and evaluated on PASCAL VOC 2007 detection ben...
Rescoring detections based on contextual scores in object detection
Zorlu, Ersan Vural; Akbaş, Emre; Department of Computer Engineering (2019)
To detect objects in an image, current state-of-the-art object detectors first define candidate object locations, and then classify each of them into one of the predefined categories or as background. They do so by using the visual features extracted locally from the candidate locations, omitting the rich contextual information embedded in the whole image. Contextual information can be utilized to complement the information extracted locally and thereby to improve object detection accuracy. Researchers have p...
Fine-grained object recognition and zero-shot learning in multispectral imagery
Sumbul, Gencer; Cinbiş, Ramazan Gökberk; AKSOY, SELİM (2018-05-05)
We present a method for the fine-grained object recognition problem, which aims to recognize the type of an object among a large number of sub-categories, and for the zero-shot learning scenario on multispectral images. In order to establish a relation between seen classes and new unseen classes, a compatibility function between image features extracted from a convolutional neural network and auxiliary information of classes is learnt. Knowledge transfer for unseen classes is carried out by maximizing this function. Per...
Object-based image labeling through learning by example and multi-level segmentation
Xu, Y; Duygulu, P; Saber, E; Tekalp, AM; Yarman Vural, Fatoş Tunay (Elsevier BV, 2003-06-01)
We propose a method for automatic extraction and labeling of semantically meaningful image objects using "learning by example" and threshold-free multi-level image segmentation. The proposed method scans through images, each of which is pre-segmented into a hierarchical uniformity tree, to seek and label objects that are similar to an example object presented by the user. By representing images with stacks of multi-level segmentation maps, objects can be extracted in the segmentation map level with adequate...
Multisource region attention network for fine-grained object recognition in remote sensing imagery
Sümbül, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (Institute of Electrical and Electronics Engineers (IEEE), 2019-07)
Fine-grained object recognition concerns the identification of the type of an object among a large number of closely related subcategories. Multisource data analysis that aims to leverage the complementary spectral, spatial, and structural information embedded in different sources is a promising direction toward solving the fine-grained recognition problem that involves low between-class variance, small training set sizes for rare classes, and class imbalance. However, the common assumption of coregistered ...
Citation Formats
E. Halici and A. A. Alatan, “OBJECT LOCALIZATION WITHOUT BOUNDING BOX INFORMATION USING GENERATIVE ADVERSARIAL REINFORCEMENT LEARNING,” 2018, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55651.