Comparison of deep networks for gesture recognition

2021-09-06
Sofu, Buğra
Gesture recognition is an important problem that has been studied for years, especially in fields such as surveillance systems, analysis of human behavior, and robotics. In this thesis, several state-of-the-art deep learning algorithms were implemented and compared in terms of model complexity and accuracy. A new approach was also proposed and compared with them. The tested algorithms fall into two main categories: hybrid approaches, which apply CNN and LSTM architectures in succession, and three-dimensional convolutional neural networks (3D-CNNs). For the hybrid approaches, we studied CNN-LSTM models and investigated the effect of different feature extractors, such as the Inception-V3 and ResNext50 models. For the ResNext50 architecture, in addition to the original network, we included an attention module called the Squeeze and Excitation (SE) block. With this new approach, a 21% increase in accuracy was achieved while the number of parameters decreased, which means lower model complexity than the original approach. For the 3D-CNNs, the I3D model, which has pre-trained ImageNet weights, was applied and compared with C3D models, which cannot use ImageNet weights directly. The ability to use ImageNet weights speeds up training, since the network is initialized with ImageNet features, and can also result in a more accurate and effective model overall. A 16.5% increase in accuracy was obtained for the 3D-CNN architecture when the I3D model was trained on the Kinetics dataset.
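
To make the Squeeze and Excitation block mentioned in the abstract more concrete, the following is a minimal sketch of its usual formulation: global average pooling per channel (squeeze), two fully connected layers with ReLU and sigmoid (excitation), and channel-wise rescaling. It is not the thesis implementation. The function name `squeeze_excite`, the reduction ratio `r = 2`, the random weights, and the omission of biases are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import math
import random


def squeeze_excite(feature_map, w1, w2):
    """Apply a Squeeze-and-Excitation gate to one feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w1: (C // r) x C weight matrix, w2: C x (C // r) weight matrix.
    Biases are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: expand back to C channels and apply a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Hypothetical toy input: C = 4 channels of size 2 x 2, reduction ratio r = 2.
random.seed(0)
C, r = 4, 2
x = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y, gates = squeeze_excite(x, w1, w2)
```

In this formulation the block adds only the two small weight matrices, and its output has the same shape as its input, so it can be placed after a convolutional stage without changing the rest of the network.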

Citation Formats
B. Sofu, “Comparison of deep networks for gesture recognition,” M.S. - Master of Science, Middle East Technical University, 2021.