Comparison of deep networks for gesture recognition
Download
thesisBugraSofu28092021.pdf
Date
2021-09-06
Author
Sofu, Buğra
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Gesture recognition is an important problem that has been studied for years, especially in fields such as surveillance systems, analysis of human behavior, and robotics. In this thesis, different state-of-the-art deep learning algorithms were implemented and compared in terms of model complexity and accuracy. A new approach was also proposed and compared with them. The tested algorithms fall into two main categories: hybrid approaches, which apply CNN and LSTM architectures in succession, and three-dimensional convolutional neural networks (3D-CNNs). For the hybrid approaches, we studied CNN-LSTM models and investigated the effect of different feature extractors, such as the Inception-V3 and ResNext50 models. For the ResNext50 architecture, in addition to the original network, we included an attention module called the Squeeze and Excitation (SE) block. With this new approach, a 21% increase in accuracy was achieved while the number of parameters decreased, meaning lower model complexity than the original approach. For the 3D-CNNs, the I3D model, which has pre-trained ImageNet weights, was applied and compared with C3D models, which cannot use ImageNet weights directly. The ability to use ImageNet weights speeds up training, since the network is initialized with ImageNet features, and can also result in a more accurate and effective model overall. A 16.5% increase in accuracy was obtained for the 3D-CNN architecture when the I3D model was trained on the Kinetics dataset.
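The abstract does not describe how the SE block is built, so the following is only a minimal sketch of the generally published Squeeze and Excitation idea, not the thesis implementation. The function name `se_block`, the reduction ratio `r`, the random weights, and the omission of biases are all assumptions made for illustration. It shows the three steps: squeeze (global average pooling per channel), excitation (a small bottleneck network ending in a sigmoid), and scale (reweighting each channel).

```python
import math
import random


def se_block(feature_map, w1, w2):
    """Apply one Squeeze-and-Excitation step to a C x H x W feature map.

    feature_map: nested lists, indexed as [channel][row][col]
    w1: (C // r) x C weights of the reduction layer (hypothetical values)
    w2: C x (C // r) weights of the expansion layer (hypothetical values)
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: expand back to C values; sigmoid keeps gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy usage: 4 channels of size 2 x 2 with an assumed reduction ratio r = 2.
random.seed(0)
C, H, W, r = 4, 2, 2, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y, gates = se_block(x, w1, w2)
```

In this sketch the two weight matrices together hold only 2 * C * (C // r) values, so the gating step itself adds few parameters relative to the convolutional layers it reweights. How the thesis obtained an overall decrease in parameter count is not stated in the abstract.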
Subject Keywords
Gesture Recognition, Hybrid Networks, 3D-CNNs, Two Stream Networks
URI
https://hdl.handle.net/11511/93021
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation Formats
B. Sofu, “Comparison of deep networks for gesture recognition,” M.S. - Master of Science, Middle East Technical University, 2021.