A Study on particle filter based audio-visual face tracking on the AV16.3 dataset

2016
Yılmaz, Yunus Emre
People tracking has recently received considerable attention as a research field, since a wide range of applications require tracking one or more people in different environments and scenarios using a variety of sensors. In such tracking scenarios, audio and visual information are commonly used together, because both cues are usually present in the tracking environment and carry complementary information about the targets. Our work focuses on a particle filter based Bayesian tracking method that fuses location estimates obtained separately from audio and video data in crowded indoor environments. Surveillance, video conferencing and security are the main application areas for this kind of tracking. In our work, particle filter based trackers are implemented in a number of different configurations in order to investigate the possible gains from adding audio data to the tracking problem instead of using only visual data. Comprehensive experiments are conducted with these implementations on the AV16.3 dataset. Using this dataset makes it possible to compare our results with other works in the literature. The dataset also covers a variety of tracking situations (e.g. occlusions and rapid movements of persons) that can be encountered in realistic scenarios, which makes the results more useful. Our results indicate that no significant gains from including audio are possible when multiple cameras are used, except when there are serious optical occlusions.
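
To make the fusion idea in the abstract concrete, the following is a minimal sketch of a bootstrap particle filter that combines separate audio and video location estimates. It is not the thesis implementation: the 2-D state, the random-walk motion model, the isotropic Gaussian likelihoods, the noise parameters, multinomial resampling, and the function names (`gaussian_likelihood`, `particle_filter_step`, `track`) are all assumptions chosen for illustration. A missing video observation (`None`) stands in for an occluded frame.

```python
import math
import random


def gaussian_likelihood(particle, measurement, sigma):
    """Unnormalised isotropic Gaussian likelihood of a 2-D measurement."""
    dx = particle[0] - measurement[0]
    dy = particle[1] - measurement[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def particle_filter_step(particles, video_obs, audio_obs, rng,
                         motion_sigma=0.05, video_sigma=0.1, audio_sigma=0.3):
    """One predict/update/resample cycle; either observation may be None."""
    # Predict: random-walk motion model (an assumption, not the thesis model).
    predicted = [(x + rng.gauss(0.0, motion_sigma),
                  y + rng.gauss(0.0, motion_sigma)) for x, y in particles]

    # Update: multiply the likelihoods of whichever cues are available,
    # treating audio and video as conditionally independent given the state.
    weights = []
    for p in predicted:
        w = 1.0
        if video_obs is not None:
            w *= gaussian_likelihood(p, video_obs, video_sigma)
        if audio_obs is not None:
            w *= gaussian_likelihood(p, audio_obs, audio_sigma)
        weights.append(w)

    total = sum(weights)
    if total <= 0.0:
        # All weights underflowed: fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    est_x = sum(w * p[0] for w, p in zip(weights, predicted))
    est_y = sum(w * p[1] for w, p in zip(weights, predicted))

    # Resample: multinomial resampling with replacement.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, (est_x, est_y)


def track(video_seq, audio_seq, n_particles=500, seed=0):
    """Run the filter over paired observation sequences in the unit square."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
                 for _ in range(n_particles)]
    estimates = []
    for video_obs, audio_obs in zip(video_seq, audio_seq):
        particles, estimate = particle_filter_step(
            particles, video_obs, audio_obs, rng)
        estimates.append(estimate)
    return estimates
```

In this sketch the audio cue is given a larger standard deviation than the video cue, so it contributes little while video is available and mainly keeps the particle cloud anchored while the video observation is missing. That choice is illustrative only and is not taken from the thesis.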

Citation Formats
Y. E. Yılmaz, “A Study on particle filter based audio-visual face tracking on the AV16.3 dataset,” M.S. - Master of Science, Middle East Technical University, 2016.