Summarizing video: Content, features, and HMM topologies

An algorithm is proposed for automatic summarization of multimedia content by segmenting digital video into semantic scenes using hidden Markov models (HMMs). Various multi-modal low-level features are extracted to determine state transitions in HMMs for summarization. The advantage of using different model topologies and observation sets to segment different content types is emphasized and verified by simulations. The performance of the proposed algorithm is also compared with that of a deterministic scene segmentation method. Better performance is observed, owing to the flexibility of HMMs in modeling different content types.
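
The idea of reading scene boundaries off HMM state transitions can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical two-state, fully connected HMM (state 0 and state 1 standing for two scene types) and a single quantized feature per shot (0 = low motion, 1 = high motion). All probabilities, names, and the choice of Viterbi decoding are illustrative assumptions and do not come from the paper.

```python
import math


def viterbi(observations, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a discrete-observation HMM."""
    n_states = len(start_p)

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # scores[s] = best log-probability of any path that ends in state s.
    scores = [log(start_p[s]) + log(emit_p[s][observations[0]])
              for s in range(n_states)]
    backpointers = []

    for obs in observations[1:]:
        new_scores, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for arriving in state s at this step.
            best_prev = max(range(n_states),
                            key=lambda r: scores[r] + log(trans_p[r][s]))
            new_scores.append(scores[best_prev]
                              + log(trans_p[best_prev][s])
                              + log(emit_p[s][obs]))
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def scene_boundaries(path):
    """Indices at which the decoded state changes, read here as scene cuts."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Hypothetical model parameters (assumed for illustration only).
START_P = [0.6, 0.4]
TRANS_P = [[0.9, 0.1],   # states tend to persist, so scenes span several shots
           [0.1, 0.9]]
EMIT_P = [[0.8, 0.2],    # state 0 mostly emits "low motion"
          [0.2, 0.8]]    # state 1 mostly emits "high motion"

if __name__ == "__main__":
    shots = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
    decoded = viterbi(shots, START_P, TRANS_P, EMIT_P)
    print(decoded)
    print(scene_boundaries(decoded))
```

In this sketch, changing the topology means changing `TRANS_P` (for example, zeroing backward transitions to obtain a left-to-right model), and changing the observation set means changing the symbol alphabet and `EMIT_P`.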


Recursive Prediction for Joint Spatial and Temporal Prediction in Video Coding
Kamışlı, Fatih (2014-06-01)
Video compression systems use prediction to reduce redundancies present in video sequences along the temporal and spatial dimensions. Standard video coding systems use either temporal or spatial prediction on a per block basis. If temporal prediction is used, spatial information is ignored. If spatial prediction is used, temporal information is ignored. This may be a computationally efficient approach, but it does not effectively combine temporal and spatial information. In this letter, we provide a framewo...
Depth assisted object segmentation in multi-view video
Cigla, Cevahir; Alatan, Abdullah Aydın (2008-01-01)
In this work, a novel and unified approach for multi-view video (MVV) object segmentation is presented. In the first stage, a region-based graph-theoretic color segmentation algorithm is proposed, in which the popular Normalized Cuts segmentation method is improved with some modifications on its graph structure. Segmentation is obtained by recursive bi-partitioning of a weighted graph of an initial over-segmentation mask. The available segmentation mask is also utilized during dense depth map estimation ste...
Streaming Multiscale Deep Equilibrium Models
Ertenli, Can Ufuk; Akbaş, Emre; Cinbiş, Ramazan Gökberk (2022-01-01)
We present StreamDEQ, a method that infers frame-wise representations on videos with minimal per-frame computation. In contrast to conventional methods where compute time grows at least linearly with the network depth, we aim to update the representations in a continuous manner. For this purpose, we leverage the recently emerging implicit layer models, which infer the representation of an image by solving a fixed-point problem. Our main insight is to leverage the slowly changing nature of videos and use the...
Graph-based multilevel temporal segmentation of scripted content videos
Sakarya, Ufuk; Telatar, Ziya (2007-06-13)
This paper concentrates on a graph-based multilevel temporal segmentation method for scripted content videos. In each level of the segmentation, a similarity matrix of frame strings, which are series of consecutive video frames, is constructed by using temporal and spatial contents of frame strings. A strength factor is estimated for each frame string by using a priori information of a scripted content. According to the similarity matrix reevaluated from a strength function derived by the strength factors, ...
Automatic categorization and summarization of documentaries
Demirtas, Kezban; Çiçekli, Fehime Nihan; Çiçekli, İlyas (2010-12-01)
In this paper, we propose automatic categorization and summarization of documentaries using subtitles of videos. We propose two methods for video categorization. The first makes unsupervised categorization by applying natural language processing techniques on video subtitles and uses the WordNet lexical database and WordNet domains. The second has the same extraction steps but uses a learning module to categorize. Experiments with documentary videos give promising results in discovering the correct categori...
Citation Formats
Y. Yasaroglu and A. A. Alatan, “Summarizing video: Content, features, and HMM topologies,” Visual Content Processing and Representation, Proceedings, pp. 101–110, 2003.