Human action recognition for various input characteristics using 3 dimensional residual networks

Tüfekci, Gülin
Action recognition using deep neural networks is a broad research area with applications such as statistical analysis of human behavior, abnormality detection with surveillance cameras, and robotic systems. Previous studies have mostly proposed new machine learning algorithms and deep network architectures to obtain higher recognition accuracy. Instead of proposing another network that yields a small accuracy gain, this thesis evaluates different input characteristics as a way to increase the learning capacity of the networks. 3-dimensional residual networks are used for this purpose because of their effective learning process. Among all the modifications applied to the inputs, increasing the sample duration up to 60 frames and masking the RGB pixel values with the motion flow between consecutive frames provide high accuracy gains. Employing 60 frames instead of 16 frames quadruples the computation time while increasing accuracy by 10%. Masking the frames results in a 12% gain in recognition accuracy. Both modifications contribute to the learning process of the network: longer temporal extents emphasize the relations between patterns, and masking guides the network to focus on the areas where the main action takes place. Obtaining accuracy gains of this size by modifying only the input is notable. Moreover, the recognition accuracy improves further when the network is pre-trained on a large-scale dataset. These results are useful because the input characteristics that yield high accuracy gains can be applied to other networks to increase their recognition accuracy.
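The masking idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `mask_frames_with_motion`, the hard binary threshold, and the choice to pair each flow field with the earlier of its two frames are all assumptions made for illustration, and the flow field is taken as a precomputed input rather than estimated here.

```python
import numpy as np


def mask_frames_with_motion(frames, flow, threshold=1.0):
    """Zero out RGB pixels whose motion magnitude is at or below `threshold`.

    frames: (T, H, W, 3) uint8 clip.
    flow:   (T-1, H, W, 2) per-pixel displacement between consecutive frames
            (assumed to be precomputed by some optical-flow method).
    Returns a (T-1, H, W, 3) clip; the last frame has no flow and is dropped.
    """
    if frames.shape[0] != flow.shape[0] + 1 or frames.shape[1:3] != flow.shape[1:3]:
        raise ValueError("flow must have shape (T-1, H, W, 2) matching frames")

    # Per-pixel motion magnitude, shape (T-1, H, W).
    magnitude = np.linalg.norm(flow, axis=-1)

    # Hypothetical hard mask: keep a pixel only where motion exceeds the threshold.
    mask = (magnitude > threshold).astype(frames.dtype)

    # Broadcast the mask over the three colour channels.
    return frames[:-1] * mask[..., None]


# Toy clip: 3 white frames of 4x4 pixels, with motion at a single pixel.
frames = np.full((3, 4, 4, 3), 255, dtype=np.uint8)
flow = np.zeros((2, 4, 4, 2), dtype=np.float32)
flow[0, 1, 1] = (3.0, 4.0)  # displacement with magnitude 5

masked = mask_frames_with_motion(frames, flow)

# Ratio of clip lengths compared in the abstract: 60 frames versus 16 frames.
clip_length_ratio = 60 / 16
```

In this toy clip only the single moving pixel of the first frame keeps its RGB values. The clip-length ratio of 3.75 is in line with the roughly fourfold computation time reported in the abstract, assuming cost grows about linearly with the number of frames.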


A comparison on textured motion classification
Oztekin, Kaan; Akar, Gözde (2006-01-01)
Textured motion (generally known as dynamic or temporal texture) analysis, classification, synthesis, segmentation and recognition are popular research areas in several fields such as computer vision, robotics, animation and multimedia databases. In the literature, several algorithms, both stochastic and deterministic, have been proposed to characterize these textured motions. However, there is no study which compares the performances of these algorithms. In this paper, we carry out a complete compari...
Human body part detection and multi-human tracking in surveillance videos
Topçu, Hasan Hüseyin; Çiçekli, Fehime Nihan; Ulusoy, İlkay; Department of Computer Engineering (2012)
With the recent developments in Computer Vision and Pattern Recognition, surveillance applications are equipped with the capabilities of event/activity understanding and interpretation which usually require recognizing humans in real world scenes. Real world scenes such as airports, streets and train stations are complex because they involve many people, complicated occlusions and cluttered backgrounds. Although complex real world scenes exist, human detectors have the capability to locate pedestrians accur...
Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision?
Krüger, Norbert; Janssen, Peter; Kalkan, Sinan; Lappe, Markus; Leonardis, Ales; Piater, Justus; Rodriguez-Sanchez, Antonio J.; Wiskott, Laurenz (Institute of Electrical and Electronics Engineers (IEEE), 2013-08-01)
Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Or...
Comparison of deep networks for gesture recognition
Sofu, Buğra; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2021-09-06)
Gesture recognition is an important problem that has been studied over the years, especially in fields such as surveillance systems, analysis of human behavior and robotics. In this thesis, different state-of-the-art algorithms based on deep learning were implemented and compared in terms of model complexity and accuracy. A new approach was also proposed and compared with them. The tested algorithms can be classified into two main categories: hybrid approaches, which use CNN and LSTM architectu...
Articulated motion analysis via axis-based representation
Erdem, Sezen; Tarı, Zehra Sibel (2007-01-01)
Human motion analysis is one of the active research areas in computer vision. The trend is shifting from computing motion fields to determining actions. We present an action coding scheme based on a trajectory of features defined with respect to a part-based coordinate system. The method does not require a prior human model or special motion capture hardware. The features are extracted from images segmented in the form of silhouettes. The feature extraction step ignores 3D effects such as self occlusions or motion...
Citation Formats
G. Tüfekci, “Human action recognition for various input characteristics using 3 dimensional residual networks,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences, Electrical and Electronics Engineering, Middle East Technical University, 2019.