Object detection through search with a foveated visual system

2017-10-01
Humans and many other species sense visual information with varying spatial resolution across the visual field (foveated vision) and deploy eye movements to actively sample regions of interest in scenes. The advantage of such a varying-resolution architecture is a reduced computational, and hence metabolic, cost. But what are the performance costs of such a processing strategy relative to a scheme that processes the visual field at high spatial resolution? Here we first focus on visual search and combine object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We develop a foveated object detector that processes the entire scene with varying resolution, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. We compared the foveated object detector against a non-foveated version of the same object detector, which processes the entire image with homogeneous high spatial resolution. We evaluated the accuracy of the foveated and non-foveated object detectors in identifying 20 different object classes in scenes from a standard computer vision data set (the PASCAL VOC 2007 dataset). We show that the foveated object detector can approximate the performance of the object detector with homogeneous high spatial resolution processing while bringing significant computational cost savings. Additionally, we assessed the impact of foveation on the computation of bottom-up saliency. An implementation of a simple foveated bottom-up saliency model with eye movements showed agreement with a non-foveated high-resolution saliency model in the selection of the top salient regions of scenes.
Together, our results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance in visual search while resulting in computational and metabolic savings to the brain.
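The search loop described in the abstract (process the scene at eccentricity-dependent resolution, pick the next fixation from the current evidence, integrate across fixations) can be illustrated with a toy sketch. This is not the authors' model: it assumes square averaging windows in place of the V1 pooling regions, and uses raw pixel brightness as a stand-in for the retino-specific classifier scores. The names `foveate`, `search`, `fovea_radius`, and `slope` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def foveate(image, fixation, fovea_radius=4.0, slope=0.2):
    """Blur `image` more strongly the farther a pixel is from `fixation`.

    Assumption: each pixel is replaced by the mean of a square window whose
    half-width grows linearly with eccentricity (0 inside the fovea).
    """
    h, w = image.shape
    # Summed-area table with a zero border: any box sum is four lookups.
    sat = np.zeros((h + 1, w + 1))
    sat[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    ys, xs = np.mgrid[0:h, 0:w]
    ecc = np.hypot(ys - fixation[0], xs - fixation[1])
    half = np.floor(slope * np.maximum(ecc - fovea_radius, 0.0)).astype(int)
    # Window bounds, clipped to the image.
    y0 = np.clip(ys - half, 0, h)
    y1 = np.clip(ys + half + 1, 0, h)
    x0 = np.clip(xs - half, 0, w)
    x1 = np.clip(xs + half + 1, 0, w)
    total = sat[y1, x1] - sat[y0, x1] - sat[y1, x0] + sat[y0, x0]
    return total / ((y1 - y0) * (x1 - x0))

def search(image, start, n_fixations=4):
    """Fixate, observe a foveated view, integrate, move to the evidence peak."""
    fixation = start
    evidence = np.full(image.shape, -np.inf)
    path = [fixation]
    for _ in range(n_fixations):
        view = foveate(image, fixation)
        # Toy integration rule (an assumption): keep the best score seen so far.
        evidence = np.maximum(evidence, view)
        peak = np.unravel_index(np.argmax(evidence), evidence.shape)
        fixation = (int(peak[0]), int(peak[1]))
        path.append(fixation)
    return path, evidence

# Usage: a small bright target far from the initial fixation.
scene = np.zeros((64, 64))
scene[40:43, 50:53] = 1.0
path, evidence = search(scene, start=(5, 5))
print(path)
```

In the first view the target is only a faint, blurred bump in the periphery, which is still enough to pull the next fixation toward it; later fixations resolve it at full resolution. Note that this sketch only mimics the loss of peripheral resolution. It does not reproduce the computational savings reported in the paper, which come from evaluating fewer pooled features in the periphery.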
PLOS COMPUTATIONAL BIOLOGY

Suggestions

Visual object representations: effects of feature frequency and similarity
Eren Kanat, Selda; Hohenberger, Annette Edeltraud; Department of Cognitive Sciences (2011)
The effects of feature frequency and similarity on object recognition have been examined through behavioral experiments, and a model of the formation of visual object representations and old/new recognition has been proposed. A number of experiments were conducted to test the hypothesis that frequency and similarity of object features affect the old/new responses to test stimuli in a later recognition task. In the first experiment, when the feature frequencies are controlled, there was a significant increas...
Anomaly Detection and Activity Perception Using Covariance Descriptor for Trajectories
Ergezer, Hamza; Leblebicioğlu, Mehmet Kemal (2016-10-16)
In this work, we study the problems of anomaly detection and activity perception through the trajectories of objects in crowded scenes. For this purpose, we propose a novel representation for trajectories via covariance features. Representing trajectories via feature covariance matrices enables us to calculate the distance between the trajectories of different lengths. After setting this proposed representation and calculation of distances between trajectories, anomaly detection is achieved by sparse repres...
Visual object detection and tracking using local convolutional context features and recurrent neural networks
Kaya, Emre Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
Visual object detection and tracking are two major problems in computer vision which have important real-life application areas. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and outperformed methods that rely on handcrafted representations in both detection and tracking. On the other hand, Recurrent Neural Networks (RNNs) are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduc...
Shape descriptors based on intersection consistency and global binary patterns
Sivri, Erdal; Kalkan, Sinan; Department of Computer Engineering (2012)
Shape description is an important problem in computer vision because most vision tasks that require comparing or matching visual entities rely on shape descriptors. In this thesis, two novel shape descriptors are proposed, namely Intersection Consistency Histogram (ICH) and Global Binary Patterns (GBP). The former is based on a local regularity measure called Intersection Consistency (IC), which determines whether edge pixels in an image patch point towards the center or not. The second method, called Globa...
Visual detection and tracking of moving objects
Ergezer, Hamza; Leblebicioğlu, Mehmet Kemal; Department of Electrical and Electronics Engineering (2007)
In this study, primary steps of a visual surveillance system are presented: moving object detection and tracking of these moving objects. Background subtraction has been performed to detect the moving objects in the video, which has been taken from a static camera. Four methods, frame differencing, running (moving) average, eigenbackground subtraction and mixture of Gaussians, have been used in the background subtraction process. After background subtraction, using some additional operations, such as morpho...
Citation Formats
E. Akbaş, “Object detection through search with a foveated visual system,” PLOS COMPUTATIONAL BIOLOGY, 2017, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/48965.