Online and semi-automatic annotation of faces in personal videos

2010
Yılmaztürk, Mehmet Celaleddin
Video annotation has become an important issue due to the rapidly increasing amount of video available. For efficient video content search, annotation has to be done beforehand, which is time-consuming if done manually. Automatic annotation of faces for person identification is a major challenge in content-based video retrieval. This thesis focuses on the development of a semi-automatic face annotation system that benefits from online learning methods. The system creates a face database by using face detection and tracking algorithms to collect samples of the faces encountered in the video and by receiving labels from the user. A learner model is trained using this database. While the training session continues, the system starts offering labels for newly encountered faces and lets the user acknowledge or correct the suggested labels; the learner is thus updated online throughout the video. The user is free to train the learner until satisfactory results are obtained. To create the face database, a shot boundary detection algorithm is implemented to partition the video into semantically meaningful segments, and the user browses through the video from one shot boundary to the next. A face detector followed by a face tracker is implemented to collect face samples between two shot boundary frames. For online learning, computationally efficient feature extraction and classification methods are investigated and evaluated. Sequential variants of some robust batch classification algorithms are implemented. Combinations of feature extraction and classification methods are tested and compared according to their face recognition accuracy and computational performance.
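The suggest, confirm-or-correct, update cycle described in the abstract can be illustrated with a minimal sketch. The abstract does not name the feature extractors or classifiers that were evaluated, so everything below is an assumption made for illustration: a sequential nearest-class-mean classifier stands in for the online learner, each face track is reduced to a plain feature vector, and the names `OnlineNearestMean`, `annotate`, and `ask_user` are hypothetical.

```python
import math
from collections import defaultdict


class OnlineNearestMean:
    """Sequential nearest-class-mean classifier (illustrative stand-in only)."""

    def __init__(self):
        self.means = {}                 # label -> running mean feature vector
        self.counts = defaultdict(int)  # label -> number of samples seen

    def predict(self, x):
        """Return the label of the closest class mean, or None if untrained."""
        if not self.means:
            return None
        return min(self.means, key=lambda lbl: math.dist(self.means[lbl], x))

    def update(self, x, label):
        """Fold one labelled sample into the running mean of its class."""
        self.counts[label] += 1
        n = self.counts[label]
        if label not in self.means:
            self.means[label] = list(x)
        else:
            m = self.means[label]
            self.means[label] = [mi + (xi - mi) / n for mi, xi in zip(m, x)]


def annotate(face_tracks, ask_user, learner):
    """Suggest a label per track, let the user confirm or correct, then update.

    face_tracks: iterable of feature vectors (one per tracked face, assumed).
    ask_user:    hypothetical callback (features, suggestion) -> final label.
    Returns the final labels and how many suggestions the user accepted.
    """
    labels = []
    accepted = 0
    for features in face_tracks:
        suggestion = learner.predict(features)   # None until first label arrives
        label = ask_user(features, suggestion)   # user acknowledges or corrects
        if label == suggestion:
            accepted += 1
        learner.update(features, label)          # online update after every face
        labels.append(label)
    return labels, accepted
```

In this sketch the learner is updated after every confirmed label, so later suggestions improve as the user works through the video; the share of accepted suggestions is one simple way a user could judge when the results are satisfactory.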

Suggestions

Error resilient layered stereoscopic video streaming
Tan, A. Serdar; Aksay, Anil; Bilen, Cagdas; Akar, Gözde; ARIKAN, ERDAL (2007-05-09)
In this paper, the error resilient stereoscopic video streaming problem is addressed. Two different Forward Error Correction (FEC) codes, namely Systematic LT and RS codes, are utilized to protect the stereoscopic video data against transmission errors. Initially, the stereoscopic video is categorized into 3 layers with different priorities. Then, a packetization scheme is used to increase the efficiency of error protection. A comparative analysis of RS and LT codes is provided via simulations to observe the optim...
Implementation of a distributed video codec
Işık, Cem Vedat; Akar, Gözde; Department of Electrical and Electronics Engineering (2008)
Current interframe video compression standards such as MPEG4 and H.264 require a high-complexity encoder for predictive coding to exploit the similarities among successive video frames. This requirement is acceptable for cases where the video sequence to be transmitted is encoded once and decoded many times. However, some emerging applications such as video-based sensor networks, power-aware surveillance and mobile video communication systems require computational complexity to be shifted from encoder ...
Automatic semantic content extraction in videos using a spatio-temporal ontology model
Yıldırım, Yakup; Yazıcı, Adnan; Department of Computer Engineering (2009)
Recent increase in the use of video in many applications has revealed the need for extracting the content in videos. Raw data and low-level features alone are not sufficient to fulfill the user's need; that is, a deeper understanding of the content at the semantic level is required. Currently, manual techniques are being used to bridge the gap between low-level representative features and high-level semantic content, which are inefficient, subjective and costly in time and have limitations on querying capab...
Oblivious spatio-temporal watermarking of digital video by exploiting the human visual system
Koz, Alper; Alatan, Abdullah Aydın (2008-03-01)
Imperceptibility requirement in video watermarking is more challenging compared with its image counterpart due to the additional dimension existing in video. The embedding system should not only yield spatially invisible watermarks for each frame of the video, but it should also take the temporal dimension into account in order to avoid any flicker distortion between frames. While some of the methods in the literature approach this problem by only allowing arbitrarily small modifications within frames in di...
Hybrid Fault Tolerant Peer to Peer Video Streaming Architecture
Oeztoprak, Kasim; Akar, Gözde (Institute of Electronics, Information and Communications Engineers (IEICE), 2008-11-01)
In this paper, we propose a fault tolerant hybrid p2p-CDN video streaming architecture to overcome the problems caused by peer behavior in peer-to-peer (P2P) video streaming systems. Although there are several studies modeling and analytically investigating peer behaviors in P2P video streaming systems, they do not come up with a solution to guarantee the required Quality of Service (QoS). Therefore, in this study a hybrid geographical location-time and interest based clustering algorithm is proposed t...
Citation Formats
M. C. Yılmaztürk, “Online and semi-automatic annotation of faces in personal videos,” M.S. - Master of Science, Middle East Technical University, 2010.