Human behavior understanding using video analysis

2016
Gökçe, Celal Onur
In this study, we propose a new hierarchical architecture for the human behavior understanding problem. A new dataset, the football video game (FVG) dataset, is generated; it involves activities more complex than those in any other dataset in the literature. The ball is detected using multilayer neural networks trained with the gradient descent algorithm and enhanced using the learning-with-queries method, and it is tracked using a growing window algorithm. After a region of interest is extracted around the detected and tracked ball, the primitive action is recognized using one of three approaches. The first is based on the well-known cuboid features of Dollár et al. The second is the mixture of poses (MoP) approach proposed in this study. The third is an extension of MoP in which a Fisher vector is employed to represent the mixture of poses in vector form. The primitive action sequences found in this layer are fed to the higher-level activity recognition layer. In the activity recognition layer, one of two approaches is used. One is the well-known Hidden Markov Model (HMM), which is known to work well on time series data; the other is Context Free Grammar (CFG), which can theoretically recognize more complex sequences that an HMM cannot. Another novelty of this study is that new activity types are learned using grammar induction with the Cocke, Younger and Kasami (CYK) algorithm. In this way, either pre-taught activity types can be recognized or new activity types can be learned.
The FVG dataset is tested with four combinations of approaches, and encouraging results are obtained. For primitive action recognition, the proposed MoP algorithm has a success rate of 69.3%, clearly exceeding the widely referenced Dollár et al. cuboid features, which achieve a success rate of 54.0%. Employing the Fisher vector with MoP slightly decreases performance to 67.0% while achieving high speed. For complex activity recognition, the MoP-CFG pair achieves a success rate of 62.5%, the same as the MoP-HMM pair. Both clearly exceed the cuboid-CFG and cuboid-HMM pairs, which achieve a success rate of 42.5%. There are activities that can be recognized by CFG but not by HMM; an example is the passing-the-ball-around activity.
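To make the CFG layer more concrete, the following is a minimal sketch of CYK recognition over a sequence of primitive action labels. It is not the grammar or code from the thesis: the labels `pass` and `receive`, the nonterminals `S`, `X`, `P`, `R`, and the function name `cyk_recognize` are all hypothetical, chosen only for illustration. The toy grammar accepts n passes followed by n receives, a nested pattern that a context-free grammar can express. The sketch covers recognition only, not the grammar induction step described above.

```python
def cyk_recognize(tokens, unary_rules, binary_rules, start="S"):
    """Return True if `tokens` can be derived from `start`.

    The grammar must be in Chomsky normal form:
      unary_rules:  list of (lhs, terminal)
      binary_rules: list of (lhs, (left_nonterminal, right_nonterminal))
    """
    n = len(tokens)
    if n == 0:
        return False

    # table[i][l - 1] holds the nonterminals that derive tokens[i : i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Spans of length 1: match terminals directly.
    for i, tok in enumerate(tokens):
        table[i][0] = {lhs for lhs, term in unary_rules if term == tok}

    # Longer spans: try every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for lhs, (b, c) in binary_rules:
                    if b in left and c in right:
                        table[i][span - 1].add(lhs)

    return start in table[0][n - 1]


# Hypothetical toy grammar (an assumption, not taken from the thesis):
#   S -> P R | P X,  X -> S R,  P -> 'pass',  R -> 'receive'
UNARY = [("P", "pass"), ("R", "receive")]
BINARY = [("S", ("P", "R")), ("S", ("P", "X")), ("X", ("S", "R"))]

if __name__ == "__main__":
    print(cyk_recognize(["pass", "pass", "receive", "receive"], UNARY, BINARY))
    print(cyk_recognize(["pass", "pass", "receive"], UNARY, BINARY))
```

In a two-layer setup like the one described, the token sequence would come from the primitive action recognizer, and one such grammar could be kept per activity type.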

Citation Formats
C. O. Gökçe, “Human behavior understanding using video analysis,” Ph.D. - Doctoral Program, Middle East Technical University, 2016.