Exploiting information extraction techniques for automatic semantic annotation and retrieval of news videos in Turkish

Küçük, Dilek
Information extraction (IE) is known to be an effective technique for automatic semantic indexing of news texts. In this study, we propose a text-based fully automated system for the semantic annotation and retrieval of news videos in Turkish which exploits several IE techniques on the video texts. The IE techniques employed by the system include named entity recognition, automatic hyperlinking, person entity extraction with coreference resolution, and event extraction. The system utilizes the outputs of the components implementing these IE techniques as the semantic annotations for the underlying news video archives. Apart from the IE components, the proposed system comprises a news video database in addition to components for news story segmentation, sliding text recognition, and semantic video retrieval. We also propose a semi-automatic counterpart of the system where the only manual intervention takes place during text extraction. Both systems are executed on genuine video data sets consisting of videos broadcast by the Turkish Radio and Television Corporation. The current study is significant as it proposes the first fully automated system to facilitate semantic annotation and retrieval of news videos in Turkish. The proposed system and its semi-automated counterpart are nevertheless quite generic, and hence they could be customized to build similar systems for video archives in other languages as well. Moreover, IE research on Turkish texts is known to be rare, and within the course of this study we have proposed and implemented novel techniques for several IE tasks on Turkish texts. As an application example, we have demonstrated the utilization of the implemented IE components to facilitate multilingual video retrieval.


Selective word encoding for effective text representation
Ozkan, Savas; Ozkan, Akin (The Scientific and Technological Research Council of Turkey, 2019-01-01)
Determining the category of a text document from its semantic content is well motivated in the literature and has been extensively studied in various applications. A compact representation of the text is also a fundamental step in achieving precise results for these applications, and considerable effort has been devoted to improving it. In particular, studies that aggregate word-level representations are the mainstream techniques for this problem. In this paper, w...
Using social graphs in one-class collaborative filtering problem
Kaya, Hamza; Alpaslan, Ferda Nur; Department of Computer Engineering (2009)
One-class collaborative filtering is a special type of collaborative filtering that aims to deal with datasets that lack counter-examples. In this work, we introduced social networks as a new data source for one-class collaborative filtering (OCCF) methods and sought ways to benefit from them when dealing with OCCF problems. We divided our research into two parts. In the first part, we proposed different weighting schemes based on social graphs for some well-known OCCF algorithms. One of the weig...
Automatic image annotation by ensemble of visual descriptors
Akbaş, Emre; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2006)
Automatic image annotation is the process of automatically producing words to describe the content for a given image. It provides us with a natural means of semantic indexing for content based image retrieval. In this thesis, two novel automatic image annotation systems targeting different types of annotated data are proposed. The first system, called Supervised Ensemble of Visual Descriptors (SEVD), is trained on a set of annotated images with predefined class labels. Then, the system automatically annotates...
Exploiting interclass rules for focused crawling
Altıngövde, İsmail Sengör (Institute of Electrical and Electronics Engineers (IEEE), 2004-11-01)
A focused crawler gathers relevant Web pages on a particular topic. This rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage.
Routing optimization methods for communication networks
Demircan, Ahmet Emrah; Leblebicioğlu, Mehmet Kemal; Department of Electrical and Electronics Engineering (2005)
This study discusses routing optimization techniques and algorithms for communication networks. Preventing data loss on overloaded communication links and utilizing link bandwidths efficiently are the main problems of traffic engineering. Load balancing and routing problems are solved using both heuristics, such as genetic algorithms, and simulation techniques. These algorithms work on destination-based or flow-based routing techniques and mainly change the link weight system or try to select the best...
Citation Formats
D. Küçük, “Exploiting information extraction techniques for automatic semantic annotation and retrieval of news videos in Turkish,” Ph.D. - Doctoral Program, Middle East Technical University, 2011.