Object localization with reinforcement learning

Reinforcement learning, which is increasingly used in computer vision applications, is mostly applied to object tracking and object localization problems. Training models for these problems usually relies on bounding box annotations; however, producing these annotations requires meticulous human effort. In this work, a framework for solving the object localization problem without using bounding boxes is proposed. Instead of bounding boxes, a database of tightly cropped images and a database of uncropped scenes are required. The framework consists of two parts: a reinforcement learning agent that tries to produce tightly cropped images from uncropped scenes, and a discriminator that aims to determine whether an image was generated by the reinforcement learning agent or belongs to the distribution of the tightly cropped image database. The experimental results indicate that promising localization performance can be achieved without explicit bounding box information.
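
The interaction between the two parts can be illustrated with a minimal sketch. It is not the implementation used in this work. The action set, the function names, the toy scene, and in particular the hand-written `toy_discriminator` and `greedy_policy` are assumptions made for illustration; in the proposed framework both the agent and the discriminator would be learned. The sketch only shows how a cropping agent can be rewarded by a discriminator score rather than by overlap with a ground-truth bounding box.

```python
# Hypothetical action set: each action shrinks the crop window by one pixel
# from one side, or stops the episode. The abstract does not specify the
# agent's actual actions.
ACTIONS = ("left", "right", "top", "bottom", "stop")


def apply_action(box, action, step=1):
    """Return the crop window (x0, y0, x1, y1) after one shrinking action."""
    x0, y0, x1, y1 = box
    if action == "left" and x1 - (x0 + step) >= 1:
        x0 += step
    elif action == "right" and (x1 - step) - x0 >= 1:
        x1 -= step
    elif action == "top" and y1 - (y0 + step) >= 1:
        y0 += step
    elif action == "bottom" and (y1 - step) - y0 >= 1:
        y1 -= step
    return (x0, y0, x1, y1)


def crop(scene, box):
    """Cut the window out of a scene stored as a list of pixel rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in scene[y0:y1]]


def toy_discriminator(image):
    """Stand-in for a learned discriminator (assumption): the fraction of
    foreground pixels, used as a score for 'looks tightly cropped'."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def greedy_policy(scene, box):
    """Placeholder for a learned policy (assumption): pick the action that
    most increases the discriminator score, or stop if none does."""
    best_action = "stop"
    best_score = toy_discriminator(crop(scene, box))
    for action in ACTIONS[:-1]:
        candidate = apply_action(box, action)
        if candidate == box:
            continue  # action had no effect (window already minimal)
        score = toy_discriminator(crop(scene, candidate))
        if score > best_score:
            best_action, best_score = action, score
    return best_action


def run_episode(scene, policy, discriminator, max_steps=50):
    """Start from the full scene, let the policy shrink the window, and use
    the discriminator score of the final crop as the episode reward."""
    height, width = len(scene), len(scene[0])
    box = (0, 0, width, height)
    for _ in range(max_steps):
        action = policy(scene, box)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box, discriminator(crop(scene, box))


def make_toy_scene():
    """A 6x6 background of zeros with a 3x2 object of ones."""
    scene = [[0] * 6 for _ in range(6)]
    for y in (2, 3):
        for x in (1, 2, 3):
            scene[y][x] = 1
    return scene


if __name__ == "__main__":
    box, reward = run_episode(make_toy_scene(), greedy_policy, toy_discriminator)
    print(box, reward)
```

On the toy scene the final window is `(1, 2, 4, 4)`, which coincides with the object, and no bounding box is ever given to the agent. The toy score would rate any crop lying entirely inside the object equally well; a discriminator trained on the tightly cropped image database is what would be expected to penalize such partial crops.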