Optical flow based video frame segmentation and segment classification

Akpınar, Samet
Video information retrieval is a field of multimedia research that enables the extraction of desired semantic information from video data. Content-based video information retrieval uses the visual content obtained from video scenes. For methods that address temporal concepts such as actions and events, the representation of temporal information becomes critical. In this thesis, action detection is tackled with a temporal video representation model. The visual feature optical flow is the basic construct used to formalize video parts as temporal information. In the proposed model, video action detection is treated as a two-part approach: temporal video segment classification and temporal video segmentation. In the first part, the weighted frame velocity concept is put forward and associated with the optical flow vectors. The resulting representation is used in action-based video segment classification. The second part contains a new temporal video segmentation methodology that provides segment candidates to segment classification methods in general. The methodology strengthens pixel-based cut detection methods with motion-based ones. Average motion vectors are computed from the optical flow vectors and used in pixel matching. A binary cut classification is applied to the resulting representation, enriched with a sliding-window-based approach. The proposed methods are applied to different data sets. Comparison of the results with state-of-the-art methods shows that the proposed temporal representation models and concepts increase segment and cut classification performance.
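The abstract does not give the formulas behind the segmentation part, so the following is only an illustrative sketch of the general idea (motion-compensated pixel matching followed by a sliding-window cut decision), not the thesis's actual method. It assumes a dense optical flow field of shape H x W x 2 holding (dx, dy) per pixel has already been computed, for example with OpenCV's `cv2.calcOpticalFlowFarneback`. The function names, the wrap-around shift, the window size and the ratio threshold are all hypothetical choices made for this example.

```python
import numpy as np

def average_motion_vector(flow):
    """Mean (dx, dy) over an H x W x 2 dense optical flow field."""
    return flow.reshape(-1, 2).mean(axis=0)

def compensated_difference(prev, curr, flow):
    """Mean absolute pixel difference after shifting `prev` by the average motion.

    Assumption: np.roll wraps pixels around the border, which is a
    simplification; a real system would likely mask or crop the border.
    """
    dx, dy = np.rint(average_motion_vector(flow)).astype(int)
    shifted = np.roll(prev, shift=(dy, dx), axis=(0, 1))  # rows by dy, columns by dx
    return float(np.abs(curr.astype(float) - shifted.astype(float)).mean())

def detect_cuts(scores, window=5, ratio=3.0):
    """Binary cut decision per frame pair using a sliding window.

    Index i is flagged when scores[i] exceeds `ratio` times the median of
    the other scores in its window (hypothetical decision rule).
    """
    scores = np.asarray(scores, dtype=float)
    half = window // 2
    cuts = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        neighbours = np.delete(scores[lo:hi], i - lo)  # exclude the centre score
        if neighbours.size and scores[i] > ratio * (np.median(neighbours) + 1e-9):
            cuts.append(i)
    return cuts
```

In use, `compensated_difference` would be evaluated for each consecutive frame pair and the resulting list of scores passed to `detect_cuts`. Compensating for the average motion before pixel matching is meant to keep camera pans from producing large differences that a purely pixel-based detector could mistake for cuts.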


Alignment of uncalibrated images for multi-view classification
Arık, Sercan Ömer; Vural, Elif; Frossard, Pascal (2011-12-29)
Efficient solutions for the classification of multi-view images can be built on graph-based algorithms when little information is known about the scene or cameras. Such methods typically require a pairwise similarity measure between images, where a common choice is the Euclidean distance. However, the accuracy of the Euclidean distance as a similarity measure is restricted to cases where images are captured from nearby viewpoints. In settings with large transformations and viewpoint changes, alignment of im...
Design and implementation of a novel visual analysis system for image classification
Altintakan, Ümit Lütfü; Yazıcı, Adnan; Körpeoğlu, İbrahim; Department of Computer Engineering (2013)
Technology for creating, sharing and disseminating image and video data has led to a rapid increase in the available visual data. However, the data is useless unless it can be accessed effectively, which necessitates semantic analysis of visual data. In this dissertation, we present a novel visual analysis system along with its application to the image classification problem. We aim to address the challenges in the area originating from the semantic gap, and to facilitate the research...
Camera electronics and image enhancement software for infrared detector arrays
Küçükkömürler, Alper; Akın, Tayfun; Department of Environmental Engineering (2012)
This thesis aims to design and develop camera electronics and image enhancement software for infrared detector arrays. It first discusses the camera electronics suitable for infrared detector arrays, then concentrates on the implemented image enhancement software, including defective pixel correction, contrast enhancement, noise reduction and pseudo coloring. After that, the testing and results of the implemented algorithms are presented. Camera electronics and circuit operation frequency are selected c...
HANOLISTIC: A Hierarchical Automatic Image Annotation System Using Holistic Approach
Karadag, Ozge Oztimur; Yarman Vural, Fatoş Tunay (2009-06-25)
Automatic image annotation is the process of assigning keywords to digital images based on their content information. In one sense, it is a mapping from the visual content information to the semantic context information. In this study, we propose a novel approach to the automatic image annotation problem, where the annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierar...
Multimedia Information Retrieval Using Fuzzy Cluster-Based Model Learning
Sattari, Saeid; Yazıcı, Adnan (2017-07-12)
Multimedia data, particularly digital videos, which contain various modalities (visual, audio, and text) are complex and time consuming to model, process, and retrieve. Therefore, efficient methods are required for retrieval of such complex data. In this paper, we propose a multimodal query level fusion approach using a fuzzy cluster-based learning method to improve the retrieval performance of multimedia data. Experimental results on a real dataset demonstrate that employing fuzzy clustering achieves notab...
Citation Formats
S. Akpınar, “Optical flow based video frame segmentation and segment classification,” Ph.D. - Doctoral Program, Middle East Technical University, 2018.