Automatic semantic content extraction in videos using a spatio-temporal ontology model

2009
Yıldırım, Yakup
The recent increase in the use of video in many applications has revealed the need to extract the content of videos. Raw data and low-level features alone are not sufficient to fulfill the user's needs; a deeper understanding of the content at the semantic level is required. Currently, manual techniques are used to bridge the gap between low-level representative features and high-level semantic content. These techniques are inefficient, subjective and costly in time, and they limit querying capabilities. Therefore, there is an urgent need for automatic semantic content extraction from videos. To meet this requirement, we propose an automatic semantic content extraction system for videos in terms of object, event and concept extraction. We introduce a general-purpose, ontology-based video semantic content model that uses object definitions, spatial relations and temporal relations in event and concept definitions. Various relation types are defined to describe fuzzy spatio-temporal relations between ontology classes. The video semantic content model is then used to construct domain ontologies. In addition, domain ontologies are enriched with rule definitions to lower the cost of computing spatial relations and to define some complex situations more effectively. As a case study, we have performed a number of experiments for event and concept extraction in videos for the basketball and surveillance domains. We have obtained satisfactory precision and recall rates for object, event and concept extraction. A domain-independent application for the proposed framework has been fully implemented and tested.
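To make the idea of a fuzzy spatial relation between detected objects concrete, the following is a minimal sketch and not the model defined in the thesis. It assumes objects are given as axis-aligned bounding boxes and scores the relation "a is left of b" by the cosine of the angle between the center-to-center vector and the horizontal axis. The names `Box` and `fuzzy_left_of`, and the choice of membership function, are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical representation of a detected object)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self) -> tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def fuzzy_left_of(a: Box, b: Box) -> float:
    """Return a membership value in [0, 1] for the relation 'a is left of b'.

    Assumed membership function: the cosine of the angle between the
    vector from a's center to b's center and the positive x axis,
    with negative values clamped to 0.
    """
    ax, ay = a.center
    bx, by = b.center
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return 0.0  # coincident centers: the relation is undefined, treat as 0
    return max(0.0, dx / math.hypot(dx, dy))


player = Box(0, 0, 10, 10)
hoop = Box(20, 0, 10, 10)
print(fuzzy_left_of(player, hoop))  # fully left of: membership 1.0
print(fuzzy_left_of(hoop, player))  # opposite direction: membership 0.0
```

A graded value like this, rather than a true/false answer, is what allows an event or concept definition to combine several partially satisfied relations; how the thesis actually defines and combines its memberships is not stated in this abstract.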

Suggestions

Automatic Semantic Content Extraction in Videos Using a Fuzzy Ontology and Rule-Based Model
Yildirim, Yakup; Yazıcı, Adnan; Yilmaz, Turgay (2013-01-01)
The recent increase in the use of video-based applications has revealed the need to extract the content of videos. Raw data and low-level features alone are not sufficient to fulfill the user's needs; a deeper understanding of the content at the semantic level is required. Currently, manual techniques, which are inefficient, subjective and costly in time and which limit querying capabilities, are being used to bridge the gap between low-level representative features and high-level semantic content. H...
Fusing semantic information extracted from visual, auditory and textual data of videos
Gönül, Elvan; Yazıcı, Adnan; Department of Computer Engineering (2012)
In recent years, due to the increasing usage of videos, manual information extraction has become insufficient for users. Therefore, extracting semantic information automatically has become a serious requirement. Today, there exist some systems that extract semantic information automatically by using visual, auditory and textual data separately, but the number of studies that use more than one data source is very limited. As some studies on this topic have already shown, using multimodal video data for...
Semantic video modeling and retrieval with visual, auditory, textual sources
Durak, Nurcan; Yazıcı, Adnan; Department of Computer Engineering (2004)
Studies on content-based video indexing and retrieval aim to access video content from different aspects more efficiently and effectively. Most studies have concentrated on the visual component when modeling and retrieving video content. Besides the visual component, much valuable information is also carried in other media components, such as superimposed text, closed captions, audio, and speech that accompany the pictorial component. In this study, semantic content of video is m...
Natural language querying for video databases
Erozel, Guzen; Çiçekli, Fehime Nihan; Cicekli, Ilyas (Elsevier BV, 2008-06-15)
Video databases have become popular in various areas due to recent advances in technology. Video archive systems need user-friendly interfaces to retrieve video frames. In this paper, a user interface based on natural language processing (NLP) to a video database system is described. The video database is based on a content-based spatio-temporal video data model. The data model focuses on semantic content, which includes objects, activities, and spatial properties of objects. Spatio-temporal r...
ENHANCED SPATIO-TEMPORAL VIDEO COPY DETECTION BY COMBINING TRAJECTORY AND SPATIAL CONSISTENCY
Ozkan, Savas; Esen, Ersin; Akar, Gözde (2014-10-30)
Recent improvements in internet technologies and video coding techniques have caused an increase in copyright infringements, especially for video. Frequently, image-based approaches appear as an essential solution because the joint use of quantization-based indexing and weak geometric consistency stages makes it possible to compare duplicate videos quickly. However, exploiting purely spatial content ignores the temporal variation of video. In this work, we propose a system that combines the state-of...
Citation Formats
Y. Yıldırım, “Automatic semantic content extraction in videos using a spatio-temporal ontology model,” Ph.D. - Doctoral Program, Middle East Technical University, 2009.