Identification of Coreferential Chains in Video Texts for Semantic Annotation of News Videos

2008-10-29
Kucuk, Dilek
Yazıcı, Adnan
People can benefit from today's huge video archives only through appropriate and effective ways of querying the video data. In order to query the video data through high-level semantic entities such as objects, events, and relations, these entities should be properly extracted and the corresponding video shots should be annotated accordingly. Video texts, which comprise the caption texts on the frames as well as transcription texts obtained through automatic speech recognition techniques, are valuable sources of information for semantic modeling of the videos. In this paper, we present an approach for the extraction of semantic objects from videos by utilizing lexical resources along with the identification of coreference chains in the corresponding video texts. Coreference is a phenomenon in natural language texts whereby several entities in the text refer to the same real-world entity. Therefore, while the domain-specific lexical resources aid in the determination of salient entities in the video text, the identification of coreference chains prevents the superfluous extraction of the same underlying entities due to their different surface forms in the video texts. The proposed approach is significant as the first attempt to address the importance of the coreference phenomenon in video texts for precise entity extraction during the semantic modeling of news videos, demonstrated with a hands-on application. The approach has been evaluated on Turkish political news texts from the METU Turkish corpus, and a number of evaluation problems encountered, such as the sparseness of annotated evaluation data for Turkish, are also pointed out, together with further research directions to pursue.
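To make the idea concrete, the following is a minimal, hypothetical sketch of how coreference chains can prevent duplicate entity extraction: different surface forms of the same real-world entity are grouped into one chain, and one semantic object is emitted per chain. The helper functions, the naive head-word matching heuristic, and the example mentions are illustrative assumptions, not the authors' actual algorithm.

from collections import defaultdict

def build_coreference_chains(mentions):
    """Group mentions whose normalized forms share a head word into chains."""
    chains = defaultdict(set)
    for mention in mentions:
        # Naive head-word matching: the last token serves as the chain key.
        key = mention.lower().split()[-1]
        chains[key].add(mention)
    return list(chains.values())

def extract_entities(mentions):
    """Emit one semantic object per chain instead of one per surface form."""
    entities = []
    for chain in build_coreference_chains(mentions):
        # Use the longest surface form as the canonical label of the entity.
        entities.append(max(chain, key=len))
    return entities

# Example: three surface forms of the same person collapse into one entity.
video_text_mentions = ["Abdullah Gul", "President Gul", "Gul", "Ankara"]
print(extract_entities(video_text_mentions))
# -> ['President Gul', 'Ankara']  (two entities, not four)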

Suggestions

Utilization of texture, contrast and color homogeneity for detecting and recognizing text from video frames
Tekinalp, S; Alatan, Abdullah Aydın (2003-09-17)
It is possible to index and manage large video archives in a more efficient manner by detecting and recognizing text within video frames. There are some inherent properties of videotext, such as distinctive texture, higher contrast against the background, and uniform color, that make it detectable. By employing these properties, it is possible to detect text regions and binarize the image for character recognition. In this paper, a complete framework for detection and recognition of videotext is presented. The ...
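As a small illustrative sketch (not the cited paper's actual method), the high contrast of videotext against its background can be exploited to binarize a candidate text region before character recognition. The sketch below implements a plain Otsu-style global threshold by hand and assumes frame_region is a grayscale numpy array.

import numpy as np

def binarize_text_region(frame_region, levels=256):
    """Split pixels into text/background classes by maximizing
    the between-class variance (Otsu's criterion)."""
    hist, _ = np.histogram(frame_region, bins=levels, range=(0, levels))
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, levels):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, levels) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    # High-contrast text pixels end up in one class, background in the other.
    return (frame_region >= best_t).astype(np.uint8) * 255

The resulting binary image can then be passed to a character recognizer.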
Optimization of an online course with web usage mining
Akman, LE; Akkan, B; Baykal, Nazife (2004-02-18)
The huge amount of information available on the World Wide Web constitutes an ideal environment in which to apply data mining techniques. Web mining is the mining of web data. There are different applications of web mining: web content mining, web structure mining and web usage mining. In our study we analyzed an online course with web usage mining techniques in order to optimize the navigation paths, the time spent on each page and the number of visits throughout the semester of the course. Moreove...
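A hypothetical sketch of the kind of aggregation web usage mining relies on is shown below: per-page visit counts and dwell times derived from a simple (user, page, timestamp) click log. The log format and field names are assumptions for illustration only.

from collections import defaultdict

def summarize_usage(click_log):
    """click_log: list of (user, page, timestamp_seconds), ordered per user."""
    visits = defaultdict(int)
    time_spent = defaultdict(float)
    last_event = {}  # user -> (page, timestamp)
    for user, page, ts in click_log:
        visits[page] += 1
        if user in last_event:
            prev_page, prev_ts = last_event[user]
            time_spent[prev_page] += ts - prev_ts  # dwell time on previous page
        last_event[user] = (page, ts)
    return visits, time_spent

log = [("u1", "/intro", 0), ("u1", "/lesson1", 120), ("u1", "/quiz", 400)]
print(summarize_usage(log))
# visits: each page once; time spent: /intro 120 s, /lesson1 280 s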
Exploiting information extraction techniques for automatic semantic video indexing with an application to Turkish news videos
Kucuk, Dilek; Yazıcı, Adnan (Elsevier BV, 2011-08-01)
This paper addresses the problem of automatic semantic indexing of news videos by presenting a video annotation and retrieval system which is able to perform automatic semantic annotation of news video archives and provide access to the archives via these annotations. The presented system relies on the video texts as the information source and exploits several information extraction techniques on these texts to arrive at representative semantic information regarding the underlying videos. These techniques ...
A parametric video quality model based on source and network characteristics
Zerman, Emin; Konuk, Baris; NUR YILMAZ, GÖKÇE; Akar, Gözde (2014-10-30)
The increasing demand for streaming video raises the need for flexible and easily implemented Video Quality Assessment (VQA) metrics. Although there are different VQA metrics, most of these are either Full-Reference (FR) or Reduced-Reference (RR). Both FR and RR metrics bring challenges for on-the-fly multimedia systems due to the necessity of additional network traffic for reference data. No-Reference (NR) video metrics, on the other hand, as the name suggests, are much more flexible for user-end applicatio...
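For illustration only, a parametric No-Reference model estimates quality purely from source and network parameters, with no reference video required. The functional form, parameter names, and coefficients below are made-up assumptions, not the model proposed in the cited paper.

import math

def estimate_quality(bitrate_kbps, packet_loss_pct):
    """Return a mean-opinion-score-like value in [1, 5]."""
    # Higher bitrate saturates toward the best score; packet loss degrades it.
    coding_quality = 1 + 4 * (1 - math.exp(-bitrate_kbps / 1500.0))
    loss_penalty = math.exp(-packet_loss_pct / 2.0)
    return max(1.0, min(5.0, 1 + (coding_quality - 1) * loss_penalty))

print(round(estimate_quality(3000, 0.0), 2))  # high bitrate, no loss
print(round(estimate_quality(800, 3.0), 2))   # lower bitrate with loss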
Design of H.264/AVC compatible intra-frame video encoder on FPGA programmable logic devices
Günay, Ömer; Kamışlı, Fatih; Department of Electrical and Electronics Engineering (2014)
Video compression is a technique used to reduce the amount of data in a video to limit the amount of storage space and bandwidth it requires. H.264/AVC is a widely used video compression standard developed together by the ISO (International Organization for Standardization) Moving Picture Experts Group (MPEG) and the ITU (International Telecommunication Union) Video Coding Experts Group (VCEG). H.264/AVC offers an extended range of algorithms for coding digital video to achieve superior compression efficien...
Citation Formats
D. Kucuk and A. Yazıcı, “Identification of Coreferential Chains in Video Texts for Semantic Annotation of News Videos,” 2008, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/37461.