Semantic video modeling and retrieval with visual, auditory, textual sources

Date

2004

Author

Durak, Nurcan

Studies on content-based video indexing and retrieval aim to make video content accessible from different aspects more efficiently and effectively. Most studies have concentrated on the visual component when modeling and retrieving video content. Besides the visual component, much valuable information is carried by other media components that accompany the pictorial component, such as superimposed text, closed captions, audio, and speech. In this study, the semantic content of video is modeled using visual, auditory, and textual components. In the visual domain, visual events, visual objects, and the spatial characteristics of visual objects are extracted. In the auditory domain, auditory events and auditory objects are extracted. In the textual domain, speech transcripts and visible texts are considered. With the proposed model, users can access video content from different aspects and obtain the desired information more quickly. Besides multimodality, the model is built on semantic hierarchies that enable querying the video content at different semantic levels: sequence-scene hierarchies in the visual domain, background-foreground hierarchies in the auditory domain, and subject hierarchies in the speech domain. The presented model has been implemented, and multimodal content queries, hierarchical queries, fuzzy spatial queries, fuzzy regional queries, fuzzy spatio-temporal queries, and temporal queries have been applied to video content successfully.
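The abstract does not give implementation details, so the following is only a minimal illustrative sketch of two of the query types it names: a multimodal temporal query and a fuzzy spatial query. Everything in it is an assumption made for illustration and is not taken from the thesis: the `Annotation` record, the `co_occurring` and `fuzzy_left_of` helpers, the bounding-box representation, and the particular membership function are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record for one annotated item in a video (not the thesis's schema).
@dataclass(frozen=True)
class Annotation:
    modality: str   # "visual", "auditory", or "textual"
    kind: str       # e.g. "event", "object", "speech"
    label: str
    start: float    # start time in seconds
    end: float      # end time in seconds

def overlaps(a: Annotation, b: Annotation) -> bool:
    """True when the two time intervals share at least one instant."""
    return a.start < b.end and b.start < a.end

def co_occurring(annotations: List[Annotation],
                 label_a: str, label_b: str) -> List[Tuple[Annotation, Annotation]]:
    """Temporal query across modalities: pairs labelled label_a and label_b
    whose time intervals overlap."""
    first = [x for x in annotations if x.label == label_a]
    second = [x for x in annotations if x.label == label_b]
    return [(a, b) for a in first for b in second if overlaps(a, b)]

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout

def fuzzy_left_of(box_a: Box, box_b: Box) -> float:
    """Assumed membership function: the fraction of box_a's width that lies
    to the left of box_b's left edge, clamped to [0, 1]."""
    width_a = box_a[2] - box_a[0]
    if width_a <= 0:
        return 0.0
    left_part = min(box_a[2], box_b[0]) - box_a[0]
    return max(0.0, min(1.0, left_part / width_a))

# Example data, invented for illustration.
annotations = [
    Annotation("visual", "event", "goal", 10.0, 20.0),
    Annotation("auditory", "event", "applause", 18.0, 25.0),
    Annotation("auditory", "event", "applause", 40.0, 45.0),
]
pairs = co_occurring(annotations, "goal", "applause")
degree = fuzzy_left_of((0, 0, 10, 10), (5, 0, 15, 10))
```

In this sketch, a crisp predicate such as "A is left of B" is replaced by a degree between 0 and 1, so a query can rank partially matching frames instead of rejecting them outright. A real system would likely use richer spatial relations and index structures than these linear scans.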

Subject Keywords

Electronic computers.

URI

http://etd.lib.metu.edu.tr/upload/12605393/index.pdf
https://hdl.handle.net/11511/14398

Collections

Graduate School of Natural and Applied Sciences, Thesis

Suggestions

Visual quality assessment for stereoscopic video sequences Sarıkan, Selim Sefa; Akar, Gözde; Department of Electrical and Electronics Engineering (2011) The aim of this study is to understand the effect of different depth levels on the overall 3D quality and develop an objective video quality metric for stereoscopic video sequences. Proposed method is designed to be used in video coding stages to improve overall 3D video quality. This study includes both objective and subjective evaluation. Test sequences with different coding schemes are used. Computer simulation results show that overall quality has a strong correlation with the quality of the background,...
Structural and event based multimodal video data modeling Öztarak, Hakan; Yazıcı, Adnan; Department of Computer Engineering (2005) Investments on multimedia technology enable us to store many more reflections of the real world in digital world as videos. By recording videos about real world entities, we carry a lot of information to the digital world directly. In order to store and efficiently query this information, a video database system (VDBS) is necessary. In this thesis work, we propose a structural, event based and multimodal (SEBM) video data model for VDBSs. SEBM video data model supports three different modalities that are vi...
Safran: a distributed and parallel application development frameworks of hetereogeneous workstations Gölyeri, Hamza; Bozyiğit, Müslim; Department of Computer Engineering (2005) With the rapid advances in high-speed network technologies and steady decrease in the cost of hardware involved, network of workstation (NOW) environments began to attract attention as competitors against special purpose, high performance parallel processing environments. NOWs attract attention as parallel and distributed computing environments because they provide high scalability in terms of computing capacity and they have much smaller cost/performance ratios with high availability. However, they are har...
Comparison of rough multi layer perceptron and rough radial basis function networks using fuzzy attributes Vural, Hülya; Alpaslan, Ferda Nur; Department of Computer Engineering (2004) The hybridization of soft computing methods of Radial Basis Function (RBF) neural networks, Multi Layer Perceptron (MLP) neural networks with back-propagation learning, fuzzy sets and rough sets are studied in the scope of this thesis. Conventional MLP, conventional RBF, fuzzy MLP, fuzzy RBF, rough fuzzy MLP, and rough fuzzy RBF networks are compared. In the fuzzy neural networks implemented in this thesis, the input data and the desired outputs are given fuzzy membership values as the fuzzy properties “low...
Recursive shortest spanning tree algorithms for image segmentation Bayramoğlu, Neslihan Yalçın; Bazlamaçcı, Cüneyt Fehmi; Department of Electrical and Electronics Engineering (2005) Image segmentation has an important role in image processing because it is a tool to obtain higher level object descriptions for further processing. In some applications such as large image databases or video image sequence segmentations, the speed of the segmentation algorithm may become a drawback of the application. This thesis work is a study to improve the run-time performance of a well-known segmentation algorithm, namely the Recursive Shortest Spanning Tree (RSST). Both the original and the fast RSST...

Citation Formats

N. Durak, “Semantic video modeling and retrieval with visual, auditory, textual sources,” M.S. - Master of Science, Middle East Technical University, 2004.