Employing Named Entities for Semantic Retrieval of News Videos in Turkish

2009-09-16
Kucuk, Dilek
Yazıcı, Adnan
Named entities are an important means of semantically annotating news texts. Considerable work has been carried out on semantic indexing of both textual news and news videos, especially in English, using named entities extracted from textual news or from transcriptions of news videos. In this paper, we present our semantic retrieval architecture for news videos in Turkish, which is based on prior semantic annotation of the videos with the named entities found in the corresponding news transcription texts. We employ a rule-based named entity recognizer for Turkish that makes use of handcrafted lexical resources and pattern bases. We compiled a small corpus of Turkish news videos, and the named entity recognizer in its current form achieves a success rate of about 75% on this corpus. A retrieval interface is implemented to access the video corpus through Boolean queries formed with the extracted named entities. The interface currently does not involve any ranking procedure: it displays all videos whose transcription texts satisfy the Boolean query posed through the interface, sorted by broadcast date. The presented study is significant as the first to perform automatic semantic video annotation on a genuine news video corpus in Turkish and to demonstrate the use of these annotations through a retrieval interface.
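The retrieval step described in the abstract (Boolean queries over extracted named entities, no ranking, results ordered by broadcast date) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Video` record, the tuple-based query form, the `matches` and `retrieve` helpers, and the sample clips and entity names are all hypothetical, and ascending date order is an assumption because the abstract does not state the sort direction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    """Hypothetical record: one news video with its extracted named entities."""
    title: str
    broadcast_date: date
    entities: frozenset


def matches(query, entities):
    """Evaluate a Boolean query against a set of named entities.

    A query is either a plain entity string or a tuple of the form
    ("AND", q1, q2, ...), ("OR", q1, q2, ...) or ("NOT", q).
    Matching is exact and case-sensitive in this sketch.
    """
    if isinstance(query, str):
        return query in entities
    op, *args = query
    if op == "AND":
        return all(matches(a, entities) for a in args)
    if op == "OR":
        return any(matches(a, entities) for a in args)
    if op == "NOT":
        return not matches(args[0], entities)
    raise ValueError(f"unknown operator: {op!r}")


def retrieve(query, videos):
    """Return every video satisfying the query, with no ranking,
    sorted by broadcast date (ascending order is an assumption)."""
    hits = [v for v in videos if matches(query, v.entities)]
    return sorted(hits, key=lambda v: v.broadcast_date)


# Made-up sample data, for illustration only.
CORPUS = [
    Video("clip-a", date(2009, 3, 2), frozenset({"Ankara", "TBMM"})),
    Video("clip-b", date(2009, 1, 15), frozenset({"Ankara", "Istanbul"})),
    Video("clip-c", date(2009, 2, 10), frozenset({"Ankara"})),
]

if __name__ == "__main__":
    query = ("AND", "Ankara", ("NOT", "Istanbul"))
    for video in retrieve(query, CORPUS):
        print(video.broadcast_date, video.title)
```

Because nothing is ranked, every matching video is returned and the only ordering criterion is the broadcast date, which mirrors the behaviour the abstract attributes to the interface.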

Citation Formats
D. Kucuk and A. Yazıcı, “Employing Named Entities for Semantic Retrieval of News Videos in Turkish,” 2009, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/46697.