Simultaneous bottom-up/top-down processing in early and mid level vision

2008
Erdem, Mehmet Erkut
The prevalent view in computer vision since Marr is that visual perception is a data-driven bottom-up process. In this view, image data is processed in a feed-forward fashion where a sequence of independent visual modules transforms simple low-level cues into more complex abstract perceptual units. Over the years, a variety of techniques has been developed using this paradigm. Yet an important realization is that low-level visual cues are generally so ambiguous that they could make purely bottom-up methods quite unsuccessful. These ambiguities cannot be resolved without taking account of high-level contextual information. In this thesis, we explore different ways of enriching early and mid-level computer vision modules with a capacity to extract and use contextual knowledge. Mainly, we integrate low-level image features with contextual information within unified formulations where bottom-up and top-down processing take place simultaneously.

Suggestions

Human motion analysis via axis based representations
Erdem, Sezen; Tarı, Zehra Sibel; Department of Computer Engineering (2007)
Visual analysis of human motion is one of the active research areas in computer vision. The trend shifts from computing motion fields to understanding actions. In this thesis, an action coding scheme based on trajectories of the features calculated with respect to a part based coordinate system is presented. The part based coordinate system is formed using an axis based representation. The features are extracted from images segmented in the form of silhouettes. We present some preliminary experiments that d...
Foveated image watermarking
Koz, A; Alatan, Abdullah Aydın (2002-09-25)
The spatial resolution of the human visual system (HVS) decreases rapidly away from the point of fixation (foveation point). By exploiting this fact, we propose a watermarking approach that embeds the watermark energy into the image periphery according to foveation-based HVS contrast thresholds. Compared to the other HVS-based watermarking methods, the simulation results demonstrate an improvement in the robustness of the proposed approach against image degradations, such as JPEG compression, cropping and ...
Efficient detection and tracking of salient regions for visual processing on mobile platforms
Serhat, Gülhan; Saranlı, Afşar; Department of Electrical and Electronics Engineering (2009)
Visual Attention is an interesting concept that constantly widens its application areas in the field of image processing and computer vision. The main idea of visual attention is to find the locations on the image that are visually attractive. In this thesis, the visually attractive regions are extracted and tracked in video sequences coming from the vision systems of mobile platforms. First, the salient regions are extracted in each frame and a feature vector is constructed for each one. Then Scale Invaria...
Video Shot Boundary Detection by Graph-theoretic Dominant Sets Approach
Asan, Emrah; Alatan, Abdullah Aydın (2009-09-16)
We present a video shot boundary detection algorithm based on the novel graph theoretic concept, namely dominant sets. Dominant sets are defined as a set of the nodes in a graph, mostly similar to each other and dissimilar to the others. In order to achieve this goal, candidate shot boundaries are determined simply by using pixelwise differences between consecutive frames. For each candidate position, a testing sequence is constructed by considering 4 frames before the candidate position and 2 frames after t...
Joint Utilization of Appearance, Geometry and Chance for Scene Logo Retrieval
Soysal, Medeni; Alatan, Abdullah Aydın (Oxford University Press (OUP), 2011-07-01)
A novel approach involving the comparison of appearance and geometrical similarity of local patterns via a combined description is presented. Candidate groups of interest points are identified based on unlikeliness of being matched by chance. For each of the keypoints in these groups, a novel description is proposed. This description utilizes quantized appearance descriptors of interest points to avoid the necessity of matching each test descriptor to each template descriptor. Additionally, one-to-many matc...
Citation Formats
M. E. Erdem, “Simultaneous bottom-up/top-down processing in early and mid level vision,” Ph.D. - Doctoral Program, Middle East Technical University, 2008.