Human body part detection and multi-human tracking in surveillance videos

2012
Topçu, Hasan Hüseyin
With recent developments in Computer Vision and Pattern Recognition, surveillance applications are now equipped with capabilities for event and activity understanding and interpretation, which usually require recognizing humans in real-world scenes. Real-world scenes such as airports, streets and train stations are complex because they involve many people, complicated occlusions and cluttered backgrounds. Despite this complexity, human detectors can locate pedestrians accurately and visual trackers can follow targets in cluttered environments. The integration of visual object detection and tracking, which are fundamental features of available surveillance applications, is one solution to the multi-human tracking problem in crowded scenes studied in this thesis. Human body part detectors, capable of detecting human heads and human upper body parts, are trained with Support Vector Machines (SVM) using the Histogram of Oriented Gradients (HOG), one of the state-of-the-art descriptors for human detection. The training process is elaborated by investigating the effects of the parameters of the HOG descriptor. Human heads and upper body parts are searched for in the regions of interest (ROI) computed by detecting motion. In addition, these human body part detectors are integrated with a multi-human tracker which solves the data association problem with the Multi Scan Markov Chain Monte Carlo Data Association (MCMCDA) algorithm. Associated measurements of human upper body part locations are used for state correction for each track. State estimation is done with a Kalman filter. The performance of the detectors is evaluated using the MIT Pedestrian dataset and the INRIA Human dataset.
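The predict-and-correct cycle that the abstract attributes to the Kalman filter can be illustrated with a minimal sketch. This is not the thesis implementation: the constant-velocity motion model, the single tracked coordinate, the class name `KalmanTrack1D`, the helper `run_track`, and the noise values `q` and `r` are all assumptions chosen for illustration. Each measurement stands in for an upper-body location that the data association step has already assigned to the track.

```python
from dataclasses import dataclass, field


@dataclass
class KalmanTrack1D:
    """Constant-velocity Kalman filter for one coordinate of one track."""
    pos: float = 0.0
    vel: float = 0.0
    # 2x2 state covariance; large values mean the initial state is unknown.
    P: list = field(default_factory=lambda: [[100.0, 0.0], [0.0, 100.0]])
    q: float = 0.01  # process noise added to each diagonal term (assumed)
    r: float = 1.0   # measurement noise variance (assumed)

    def predict(self, dt: float = 1.0) -> None:
        """Propagate state and covariance: x = F x, P = F P F^T + Q."""
        (p00, p01), (p10, p11) = self.P
        self.pos += dt * self.vel
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def correct(self, z: float) -> None:
        """Correct the state with an associated position measurement z."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r           # innovation variance
        k0, k1 = p00 / s, p10 / s  # Kalman gain for H = [1, 0]
        y = z - self.pos           # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def run_track(measurements, dt: float = 1.0) -> KalmanTrack1D:
    """Run predict/correct once per frame over a list of measurements."""
    track = KalmanTrack1D()
    for z in measurements:
        track.predict(dt)
        track.correct(z)
    return track
```

For image-plane tracking, one such filter per axis or a single four-state filter (x, y and their velocities) would be a natural extension; frames in which no measurement is associated with the track would call `predict` only.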

Suggestions

Human action recognition for various input characteristics using 3 dimensional residual networks
Tüfekci, Gülin; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2019)
Action recognition using deep neural networks is a far-reaching research area which has been commonly utilized in applications such as statistical analysis of human behavior, detecting abnormalities using surveillance cameras and robotic systems. Previous studies have proposed new machine learning algorithms and deep network architectures to obtain higher recognition accuracy. Instead of suggesting a network resulting in small accuracy gain, this thesis focuses on evaluat...
Data-driven image captioning via salient region discovery
Kilickaya, Mert; Akkuş, Burak Kerim; Çakıcı, Ruket; Erdem, Aykut; Erdem, Erkut; İKİZLER CİNBİŞ, NAZLI (Institution of Engineering and Technology (IET), 2017-09-01)
In the past few years, automatically generating descriptions for images has attracted a lot of attention in computer vision and natural language processing research. Among the existing approaches, data-driven methods have been proven to be highly effective. These methods compare the given image against a large set of training images to determine a set of relevant images, then generate a description using the associated captions. In this study, the authors propose to integrate an object-based semantic image r...
Infrared face recognition
Konuk, Uğur; Akar, Gözde; Department of Electrical and Electronics Engineering (2015)
Face recognition is a leading biometrics technique that fulfills the increasing need to identify a person in today’s world. Face recognition also has a broad range of uses, such as commercial and law enforcement applications. For this reason it still attracts a lot of attention and remains an active research topic. Nevertheless, visible spectrum face recognition algorithms are not free of challenges. Illumination, pose, expression variances and existence of facial disguises still degrade the performanc...
A comparison on textured motion classification
Oztekin, Kaan; Akar, Gözde (2006-01-01)
Textured motion, generally known as dynamic or temporal texture, is a popular research area in several fields such as computer vision, robotics, animation and multimedia databases, covering analysis, classification, synthesis, segmentation and recognition. In the literature, several algorithms, both stochastic and deterministic, have been proposed to characterize these textured motions. However, there is no study which compares the performances of these algorithms. In this paper, we carry out a complete compari...
Visual object detection and tracking using local convolutional context features and recurrent neural networks
Kaya, Emre Can; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
Visual object detection and tracking are two major problems in computer vision which have important real-life application areas. During the last decade, Convolutional Neural Networks (CNNs) have received significant attention and outperformed methods that rely on handcrafted representations in both detection and tracking. On the other hand, Recurrent Neural Networks (RNNs) are commonly preferred for modeling sequential data such as video sequences. A novel convolutional context feature extension is introduc...
Citation Formats
H. H. Topçu, “Human body part detection and multi-human tracking in surveillance videos,” M.S. - Master of Science, Middle East Technical University, 2012.