Human action recognition for various input characteristics using 3 dimensional residual networks

2019
Tüfekci, Gülin
Action recognition using deep neural networks is a broad research area with applications such as statistical analysis of human behavior, abnormality detection with surveillance cameras, and robotic systems. Previous studies have proposed new machine learning algorithms and deep network architectures to reach higher recognition accuracy. Instead of proposing another network that yields a small accuracy gain, this thesis evaluates different input characteristics for increasing the learning capacity of the networks. 3-dimensional residual networks are used for this purpose because of their effective learning process. Among all the modifications applied to the inputs, increasing the sample duration up to 60 frames and masking the RGB pixel values with the motion flow between consecutive frames provide high accuracy gains. Using 60 frames instead of 16 quadruples the computation time while increasing accuracy by 10%. Masking the frames yields a 12% gain in recognition accuracy. Both modifications contribute to the learning process of the network: longer temporal extents emphasize the relations between patterns, and masking guides the network to focus on the areas where the main action takes place. Obtaining accuracy gains of this size by modifying only the input is notable. Moreover, pre-training the network on a large-scale dataset improves recognition accuracy further. These results are valuable because the input characteristics that yield high accuracy gains can be applied to other networks to increase their recognition accuracy.
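
The two input modifications described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the abstract does not say how the motion flow is computed or how the mask is applied, so the sketch substitutes simple frame differencing for optical flow and uses a hard threshold. The function names (`sample_clip`, `motion_mask`, `mask_clip`), the threshold value, and the toy video are all assumptions made for illustration.

```python
import numpy as np


def sample_clip(video, num_frames, start=0):
    """Take num_frames consecutive frames, looping if the video is shorter.

    video: array of shape (T, H, W, 3). Looping is an assumption, not a
    detail taken from the abstract.
    """
    idx = (start + np.arange(num_frames)) % len(video)
    return video[idx]


def motion_mask(clip, threshold=10.0):
    """Boolean mask of pixels that changed between consecutive frames.

    Frame differencing is used here as a stand-in for motion flow.
    Returns an array of shape (T - 1, H, W).
    """
    gray = clip.astype(np.float32).mean(axis=-1)
    return np.abs(gray[1:] - gray[:-1]) > threshold


def mask_clip(clip, threshold=10.0):
    """Zero out RGB values where no motion was detected."""
    mask = motion_mask(clip, threshold)
    return clip[1:] * mask[..., None].astype(clip.dtype)


# Toy video: a bright 2x2 square sliding over a static gray background.
video = np.full((20, 8, 8, 3), 50, dtype=np.uint8)
for t in range(20):
    col = t % 6
    video[t, 2:4, col:col + 2] = 200

clip16 = sample_clip(video, 16)   # short clip
clip60 = sample_clip(video, 60)   # long clip, loops over the toy video
masked = mask_clip(clip16)        # static background becomes zero

print(clip16.shape, clip60.shape, masked.shape)
print("clip length ratio:", 60 / 16)
```

The clip length ratio of 60/16 = 3.75 is roughly in line with the fourfold increase in computation time reported in the abstract, although the actual cost depends on the network and hardware.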

Citation Formats
G. Tüfekci, “Human action recognition for various input characteristics using 3 dimensional residual networks,” M.S. thesis, Graduate School of Natural and Applied Sciences, Electrical and Electronics Engineering, Middle East Technical University, 2019.