Automatic multi-modal dialogue scene indexing

Date

2001-10-10

Author

Alatan, Abdullah Aydın

Metadata

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

An automatic algorithm for indexing dialogue scenes in multimedia content is proposed. The content is segmented into dialogue scenes using the state transitions of a hidden Markov model (HMM). Each shot is classified using both audio and visual information to determine the state/scene transitions for this model. Face detection and silence/speech/music classification are the basic tools used to index the scenes. While face information is extracted after applying some heuristics to skin-colored regions, audio analysis is achieved by examining the signal energy, periodicity, and zero-crossing rate (ZCR) of the audio waveform. The simulation results show that automatically indexing dialogues with the proposed algorithm is feasible.
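As a rough illustration of two of the audio features named in the abstract, the sketch below computes short-time energy and zero-crossing rate for a single frame of samples. It is not the paper's implementation: the function names, the thresholds (`energy_floor`, `zcr_ceiling`), and the toy labels are all assumptions made for this example, and the periodicity measure and the mapping to silence/speech/music classes are not covered.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def classify_frame(frame, energy_floor=1e-4, zcr_ceiling=0.25):
    """Toy rule with hypothetical thresholds (not values from the paper).

    Frames with very low energy are labeled "silence"; the rest are
    split by zero-crossing rate into "low-zcr" and "high-zcr".
    """
    if short_time_energy(frame) < energy_floor:
        return "silence"
    if zero_crossing_rate(frame) < zcr_ceiling:
        return "low-zcr"
    return "high-zcr"


if __name__ == "__main__":
    quiet = [0.0] * 8          # no signal at all
    steady = [0.5] * 8         # energy present, no sign changes
    buzzy = [1.0, -1.0] * 4    # sign flips on every sample
    for name, frame in [("quiet", quiet), ("steady", steady), ("buzzy", buzzy)]:
        print(name, classify_frame(frame))
```

In a fuller pipeline, per-frame labels like these would presumably be aggregated over each shot before being passed, together with the face-detection result, to the HMM as observations; how the paper does that aggregation is not stated in the abstract.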

Subject Keywords

Layout, Indexing, Hidden Markov models, Data mining, Motion pictures, Image analysis, Face detection, Information analysis, Natural languages, Signal analysis

URI

https://hdl.handle.net/11511/55886

Conference Name

International Conference on Image Processing (ICIP 2001)

Collections

Department of Electrical and Electronics Engineering, Conference / Seminar

Suggestions

Prioritized sequential 3D reconstruction in video sequences with multiple motions
Imre, Evren; Knorr, Sebastian; Alatan, Abdullah Aydın; Sikora, Thomas (2006-10-11)
In this study, an algorithm is proposed to solve the multi-frame structure from motion (MFSfM) problem for monocular video sequences in dynamic scenes. The algorithm uses the epipolar criterion to segment the features belonging to independently moving objects. Once the features are segmented, corresponding objects are reconstructed individually by using a sequential algorithm, which is also capable of prioritizing the frame pairs with respect to their reliability and information content, thus achieving a fa...

Linear Separability Analysis for Stacked Generalization Architecture
Ozay, Mete; Vural, Fatos T. Yarman (2009-04-11)
The Stacked Generalization algorithm aims to increase the individual classification performances of the classifiers by combining the information obtained from various classifiers in a multilayer architecture by either linear or nonlinear techniques. Performance of the algorithm varies depending on the application domains and the space analyses that affect the classification performances could not be applied successfully.

Summarizing video: Content, features, and HMM topologies
Yasaroglu, Y; Alatan, Abdullah Aydın (2003-01-01)
An algorithm is proposed for automatic summarization of multimedia content by segmenting digital video into semantic scenes using HMMs. Various multi-modal low-level features are extracted to determine state transitions in HMMs for summarization. The advantage of using different model topologies and observation sets in order to segment different content types is emphasized and verified by simulations. Performance of the proposed algorithm is also compared with a deterministic scene segmentation method. A better...

Human action recognition with line and flow histograms
İKİZLER CİNBİŞ, NAZLI; Cinbiş, Ramazan Gökberk; DUYGULU ŞAHİN, PINAR (2008-12-11)
We present a compact representation for human action recognition in videos using line and optical flow histograms. We introduce a new shape descriptor based on the distribution of lines which are fitted to boundaries of human figures. By using an entropy-based approach, we apply feature selection to densify our feature representation, thus minimizing classification time without degrading accuracy. We also use a compact representation of optical flow for motion information. Using line and flow histograms to...

Joint source-channel coding for error resilient transmission of static 3D models
Bici, Mehmet Oguz; Norkin, Andrey; Akar, Gözde (2012-01-01)
In this paper, a performance analysis of joint source-channel coding techniques for error-resilient transmission of three-dimensional (3D) models is presented. In particular, packet-based transmission scenarios are analyzed. The packet-loss-resilient methods are classified into two groups according to the progressive compression schemes employed: Compressed Progressive Meshes (CPM) based methods and wavelet-based methods. In the first group, layers of CPM algorithm are protected unequally by Forward Error Correc...

Citation Formats

A. A. Alatan, “Automatic multi-modal dialogue scene indexing,” presented at the International Conference on Image Processing (ICIP 2001), Thessaloniki, Greece, 2001, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55886.