Multi-modal Egocentric Activity Recognition Through Decision Fusion

Arabacı, Mehmet Ali
The use of wearable devices in daily life has grown rapidly with the development of sensor technologies. The most prominent source of information for wearable devices is the optical sensor, which produces videos from an egocentric perspective, called First Person Vision (FPV). FPV has different characteristics from third-person video because of the large amount of ego-motion and the rapid changes in scene content. Vision-based methods designed for third-person videos, where the camera is away from events and actors, cannot be directly applied to egocentric videos. Therefore, new approaches are needed that can analyze egocentric videos and accurately fuse inputs from various sensors for the task at hand. In this thesis, we propose two novel multi-modal decision fusion frameworks for egocentric activity recognition. The first framework combines hand-crafted features using Multi-Kernel Learning. The second framework utilizes deep features in a two-stage decision fusion mechanism. The experiments showed that combining multiple modalities, such as visual, audio, and other wearable sensors, increased activity recognition performance. In addition, numerous features extracted from different modalities were evaluated within the proposed frameworks. Lastly, a new egocentric activity dataset, named the Egocentric Outdoor Activity Dataset (EOAD), was collected, containing 30 different egocentric activities and 1392 video clips.
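The decision-level fusion described above can be illustrated with a minimal sketch: each modality produces its own class scores, and the final prediction is taken from a weighted combination of those scores. The modality names, score values, and fixed weights below are purely illustrative placeholders; in the thesis, the weights are learned (e.g., via multi-kernel learning) rather than hand-set.

```python
import numpy as np

# Hypothetical per-modality classifier scores for one video clip
# (one score per activity class); values are illustrative only.
scores = {
    "visual": np.array([0.10, 0.70, 0.20]),
    "audio":  np.array([0.25, 0.55, 0.20]),
    "sensor": np.array([0.30, 0.40, 0.30]),
}

# In an adaptive fusion framework these weights would be learned;
# they are fixed here for the sake of the sketch.
weights = {"visual": 0.5, "audio": 0.3, "sensor": 0.2}

# Late (decision-level) fusion: weighted sum of per-modality scores.
fused = sum(weights[m] * s for m, s in scores.items())
predicted_class = int(np.argmax(fused))
print(predicted_class)  # class with the highest fused score
```

The key point is that fusion happens after each modality has been classified independently, so modalities with heterogeneous feature spaces can still be combined.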


Multi-modal Egocentric Activity Recognition using Audio-Visual Features
Arabacı, Mehmet Ali; Özkan, Fatih; Sürer, Elif; Jancovic, Peter; Temizel, Alptekin (2018-07-01)
Egocentric activity recognition in first-person videos is of increasing importance for a variety of applications such as lifelogging, summarization, assisted living, and activity tracking. Existing methods for this task are based on interpreting various sensor information using pre-determined weights for each feature. In this work, we propose a new framework for the egocentric activity recognition problem based on combining audio-visual features with multi-kernel learning (MKL) and multi-kernel boosting (...
Multi-modal egocentric activity recognition using multi-kernel learning
Arabaci, Mehmet Ali; Ozkan, Fatih; Sürer, Elif; Jancovic, Peter; Temizel, Alptekin (2020-04-28)
Existing methods for egocentric activity recognition are mostly based on extracting motion characteristics from videos. On the other hand, the ubiquity of wearable sensors allows information to be acquired from different sources. Although the increase in sensor diversity creates a need for adaptive fusion, most studies use pre-determined weights for each source. In addition, a limited number of studies make use of optical, audio, and wearable sensors together. In this work, we propose a new framewo...
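The multi-kernel idea referenced in this abstract can be sketched as follows: each feature source gets its own kernel, and the combined kernel is a convex combination of the per-source kernels. The feature dimensions, RBF bandwidths, and the fixed combination weights below are illustrative assumptions; in actual MKL the weights are optimized jointly with the classifier.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gram matrix of an RBF kernel between the rows of X and Y."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

# Toy features from two hypothetical modalities for 4 samples.
rng = np.random.default_rng(0)
X_visual = rng.normal(size=(4, 8))   # e.g., motion descriptors
X_audio = rng.normal(size=(4, 3))    # e.g., audio descriptors

# MKL forms a convex combination of per-modality kernels
# (weights >= 0, summing to 1); fixed here for illustration.
beta = [0.6, 0.4]
K = beta[0] * rbf_kernel(X_visual, X_visual, gamma=0.1) \
  + beta[1] * rbf_kernel(X_audio, X_audio, gamma=0.5)

# K is a symmetric positive semi-definite Gram matrix and can be
# passed to any kernel classifier (e.g., an SVM with kernel="precomputed").
```

Because each modality keeps its own kernel, heterogeneous sources with different dimensionalities and scales can be fused without concatenating raw features.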
Heuristic Resource Reservation Policies for Public Clouds in the IoT Era
Gül, Ömer Melih (2022-12-01)
With the advances of the IoT era, the number of wireless sensor devices has been growing rapidly. This growth gives rise to more complex networks in which more demanding tasks can be executed by drawing on the computational resources of public clouds. Cloud service providers use various pricing models for their offered services. Some models suit a cloud service user's short-term requirements, whereas other models are appropriate for the long-term requirements of cloud service...
Enablers for IoT regarding Wearable Medical Devices to Support Healthy Living: The Five Facets
Değerli, Mustafa; Özkan Yıldırım, Sevgi (Springer Nature, 2021-01-01)
Wearables, body sensor networks, ambient technologies, and the Internet of Things (IoT) are currently quite popular in health-related research and practice. Indeed, wearable technologies are a central component of the IoT. Moreover, wearables are becoming more ubiquitous, and they offer noteworthy functions and benefits for healthy living and aging. In this context, the success of wearable medical devices is important. Nevertheless, the current understanding in this field needs enhancement. Hence, the au...
Multiple kernel learning for first-person activity recognition
Özkan, Fatih; Temizel, Alptekin; Sürer, Elif; Department of Information Systems (2017)
First-person vision applications have recently gained popularity because of advances in wearable camera technologies. In the literature, existing descriptors have been adapted to first-person videos, or new descriptors have been proposed. These descriptors have typically been used in single-kernel methods, which ignore the relative importance of each descriptor. On the other hand, first-person videos have different characteristics compared to third-person videos, which are captured by static cameras....
Citation Formats
M. A. Arabacı, “Multi-modal Egocentric Activity Recognition Through Decision Fusion,” Ph.D. - Doctoral Program, Middle East Technical University, 2023.