Multi-modal Egocentric Activity Recognition using Audio-Visual Features
Date: 2018-07-01
Author: Arabacı, Mehmet Ali; Özkan, Fatih; Sürer, Elif; Jancovic, Peter; Temizel, Alptekin
Item Usage Stats: 226 views, 117 downloads
Egocentric activity recognition in first-person videos is of increasing importance, with applications such as lifelogging, summarization, assisted living, and activity tracking. Existing methods for this task interpret information from various sensors using pre-determined weights for each feature. In this work, we propose a new framework for the egocentric activity recognition problem based on combining audio-visual features with multi-kernel learning (MKL) and multi-kernel boosting (MKBoost). First, grid optical-flow, virtual-inertia, log-covariance, and cuboid features are extracted from the video. The audio signal is characterized using a "supervector", obtained from Gaussian mixture modelling of frame-level features followed by maximum a posteriori adaptation. The extracted multi-modal features are then fused adaptively by MKL classifiers, in which feature and kernel selection/weighting are performed jointly with the recognition task. The proposed framework was evaluated on a number of egocentric datasets. The results show that using multi-modal features with MKL outperforms existing methods.
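To make the fusion step concrete, the following is a minimal sketch (not the authors' implementation) of fixed-weight multi-kernel fusion with scikit-learn: one RBF kernel is computed per modality, the kernels are combined with scalar weights, and an SVM is trained on the resulting precomputed Gram matrix. The feature matrices, dimensions, class count, and weights below are hypothetical placeholders; the paper's MKL/MKBoost classifiers learn the kernel weights jointly with the recognition task rather than fixing them by hand.

```python
# Minimal sketch: fixed-weight multi-kernel fusion with an SVM.
# All data and weights are synthetic placeholders; real MKL would
# optimize the kernel weights instead of hard-coding them.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_train, n_test = 80, 20

# Hypothetical per-modality features: visual (e.g. optical flow) and
# audio (e.g. GMM supervectors), for the same set of video clips.
X_vis = rng.normal(size=(n_train + n_test, 64))
X_aud = rng.normal(size=(n_train + n_test, 128))
y = rng.integers(0, 4, size=n_train + n_test)  # 4 activity classes

def combined_kernel(rows, cols, w=(0.6, 0.4)):
    """Weighted sum of one RBF kernel per modality."""
    k_vis = rbf_kernel(X_vis[rows], X_vis[cols], gamma=1.0 / 64)
    k_aud = rbf_kernel(X_aud[rows], X_aud[cols], gamma=1.0 / 128)
    return w[0] * k_vis + w[1] * k_aud

tr = np.arange(n_train)
te = np.arange(n_train, n_train + n_test)

clf = SVC(kernel="precomputed").fit(combined_kernel(tr, tr), y[tr])
pred = clf.predict(combined_kernel(te, tr))  # test-vs-train Gram matrix
print("accuracy on synthetic data:", (pred == y[te]).mean())
```

The point the sketch illustrates is that fusion happens at the kernel level: each modality keeps its own similarity measure, and the classifier sees only their weighted combination, which is what lets an MKL formulation weight modalities adaptively.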
Subject Keywords: Egocentric, First-person vision, Activity recognition, Multi-kernel learning
URI: https://hdl.handle.net/11511/79386
DOI: https://doi.org/10.1007/s11042-020-08789-7
Collections: Graduate School of Informatics, Article
Suggestions
Multi-modal egocentric activity recognition using multi-kernel learning
Arabaci, Mehmet Ali; Ozkan, Fatih; Sürer, Elif; Jancovic, Peter; Temizel, Alptekin (2020-04-28)
Existing methods for egocentric activity recognition are mostly based on extracting motion characteristics from videos. On the other hand, the ubiquity of wearable sensors allows acquisition of information from different sources. Although the increase in sensor diversity brings out the need for adaptive fusion, most studies use pre-determined weights for each source. In addition, only a limited number of studies make use of optical, audio and wearable sensors. In this work, we propose a new framewo...
Multi-modal Egocentric Activity Recognition Through Decision Fusion
Arabacı, Mehmet Ali; Temizel, Alptekin; Sürer, Elif; Department of Information Systems (2023-1-18)
The usage of wearable devices has grown rapidly in daily life with the development of sensor technologies. The most prominent information for wearable devices is collected from optical sensors, which produce videos from an egocentric perspective, called First Person Vision (FPV). FPV has different characteristics from third-person videos because of the large amount of ego-motion and rapid changes in scenes. Vision-based methods designed for third-person videos, where the camera is away from events and actors, canno...
Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision?
KRÜGER, Norbert; JANSSEN, Peter; Kalkan, Sinan; LAPPE, Markus; LEONARDIS, Ales; PIATER, Justus; Rodriguez-Sanchez, Antonio J.; WISKOTT, Laurenz (Institute of Electrical and Electronics Engineers (IEEE), 2013-08-01)
Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Or...
Comparison of histograms of oriented optical flow based action recognition methods
Erciş, Fırat; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2012)
In the task of human action recognition in uncontrolled video, motion features are widely used in order to achieve subject and appearance invariance. We implemented three Histograms of Oriented Optical Flow based methods which share a common motion feature extraction phase: we compute an optical flow field over each frame of the video, and the flow vectors are then histogrammed by angle so that each frame is represented with a histogram (a minimal sketch of this step follows the list below). In order to capture local motions, the bounding box of the subject is divided...
Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery
Sumbul, Gencer; Cinbiş, Ramazan Gökberk; Aksoy, Selim (2018-02-01)
Fine-grained object recognition, which aims to identify the type of an object among a large number of subcategories, is an emerging application enabled by increasing image resolution that exposes new details in image data. Traditional fully supervised algorithms fail to handle this problem, where there is low between-class variance and high within-class variance for the classes of interest, together with small sample sizes. We study an even more extreme scenario named zero-shot learning (ZSL) in which no training example exists f...
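As a concrete illustration of the histogram-of-oriented-optical-flow step described in the "Comparison of histograms of oriented optical flow based action recognition methods" entry above, here is a minimal per-frame sketch using OpenCV. The Farneback parameters and bin count are illustrative assumptions, not values taken from the cited thesis.

```python
# Minimal sketch: per-frame histogram of oriented optical flow.
# Parameters are illustrative; the cited thesis may use different ones.
import cv2
import numpy as np

def hoof(prev_gray, next_gray, n_bins=8):
    """Histogram of flow orientations, weighted by flow magnitude."""
    # Dense Farneback flow; positional args are (pyr_scale, levels,
    # winsize, iterations, poly_n, poly_sigma, flags).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in radians
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 2 * np.pi),
                           weights=mag)
    return hist / (hist.sum() + 1e-8)  # normalize so frames are comparable
```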
Citation Formats
IEEE
M. A. Arabacı, F. Özkan, E. Sürer, P. Jancovic, and A. Temizel, “Multi-modal Egocentric Activity Recognition using Audio-Visual Features,” 2018, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/79386.