Utilization of texture, contrast and color homogeneity for detecting and recognizing text from video frames

Detecting and recognizing text within video frames makes it possible to index and manage large video archives more efficiently. Videotext has inherent properties that make it detectable: a distinguishing texture, higher contrast against the background, and uniform color. By exploiting these properties, text regions can be detected and the image binarized for character recognition. In this paper, a complete framework for the detection and recognition of videotext is presented. The results of Gabor-based texture analysis, contrast-based segmentation and color homogeneity are merged to obtain a minimum number of candidate regions before binarization. The system's recognition rate is tested for various combinations, and the observed rates are reasonable for most practical purposes.
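The merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three results are combined, so the rule below (a pixel stays a candidate only if all three cues agree) is an assumption, not the authors' method. The function names `merge_masks` and `bounding_box`, the helper `parse`, and the small example masks are hypothetical, and each cue is assumed to have already produced a binary mask of the same size.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]


def merge_masks(*masks: Mask) -> Mask:
    """Keep a pixel only if every cue marks it as text (assumed AND rule)."""
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    for m in masks:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all masks must have the same shape")
    return [[all(m[r][c] for m in masks) for c in range(width)]
            for r in range(height)]


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the True pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)


def parse(rows: List[str]) -> Mask:
    """Hypothetical helper: turn strings of '0'/'1' into a boolean mask."""
    return [[ch == "1" for ch in row] for row in rows]


# Hypothetical 4x6 masks, one per cue (not data from the paper).
texture = parse(["000000", "011110", "011110", "000000"])
contrast = parse(["000000", "001111", "001111", "001100"])
color = parse(["110000", "111110", "001110", "000000"])

candidates = merge_masks(texture, contrast, color)
print(bounding_box(candidates))  # prints (1, 2, 2, 4)
```

Intersecting the masks shrinks the candidate area to the pixels on which all cues agree, which is one way to keep the number of regions passed to binarization small. A looser rule, such as a majority vote, would trade fewer missed characters for more false candidates.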


Alignment of uncalibrated images for multi-view classification
Arık, Sercan Ömer; Vural, Elif; Frossard, Pascal (2011-12-29)
Efficient solutions for the classification of multi-view images can be built on graph-based algorithms when little information is known about the scene or cameras. Such methods typically require a pairwise similarity measure between images, where a common choice is the Euclidean distance. However, the accuracy of the Euclidean distance as a similarity measure is restricted to cases where images are captured from nearby viewpoints. In settings with large transformations and viewpoint changes, alignment of im...
Image annotation with semi-supervised clustering
Sayar, Ahmet; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2009)
Image annotation is defined as generating a set of textual words for a given image, learning from the available training data consisting of visual image content and annotation words. Methods developed for image annotation usually make use of region clustering algorithms to quantize the visual information. Visual codebooks are generated from the region clusters of low level visual features. These codebooks are then matched with the words of the text document related to the image, in various ways. In this th...
Exploiting information extraction techniques for automatic semantic video indexing with an application to Turkish news videos
Kucuk, Dilek; Yazıcı, Adnan (Elsevier BV, 2011-08-01)
This paper targets the problem of automatic semantic indexing of news videos by presenting a video annotation and retrieval system which is able to perform automatic semantic annotation of news video archives and provide access to the archives via these annotations. The presented system relies on the video texts as the information source and exploits several information extraction techniques on these texts to arrive at representative semantic information regarding the underlying videos. These techniques ...
Depth assisted object segmentation in multi-view video
Cigla, Cevahir; Alatan, Abdullah Aydın (2008-01-01)
In this work, a novel and unified approach for multi-view video (MVV) object segmentation is presented. In the first stage, a region-based graph-theoretic color segmentation algorithm is proposed, in which the popular Normalized Cuts segmentation method is improved with some modifications on its graph structure. Segmentation is obtained by recursive bi-partitioning of a weighted graph of an initial over-segmentation mask. The available segmentation mask is also utilized during dense depth map estimation ste...
HANOLISTIC: A Hierarchical Automatic Image Annotation System Using Holistic Approach
Karadag, Ozge Oztimur; Yarman Vural, Fatoş Tunay (2009-06-25)
Automatic image annotation is the process of assigning keywords to digital images depending on the content information. In one sense, it is a mapping from the visual content information to the semantic context information. In this study, we propose a novel approach for the automatic image annotation problem, where the annotation is formulated as a multivariate mapping from a set of independent descriptor spaces, representing a whole image, to a set of words, representing class labels. For this purpose, a hierar...
Citation Formats
S. Tekinalp and A. A. Alatan, “Utilization of texture, contrast and color homogeneity for detecting and recognizing text from video frames,” 2003, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55182.