Multi-modal video event recognition based on association rules and decision fusion

2018-02-01
In this paper, we propose a multi-modal event recognition framework that integrates feature fusion, deep learning, scene classification and decision fusion. A video decomposition process identifies frames, shots and scenes. Events are modeled using the features of these physical video parts and the relations between them. Event modeling is achieved through visual concept learning, scene segmentation and association rule mining. Visual concept learning is employed to reveal the semantic gap between the visual content and the textual descriptors of the events. Association rules are discovered by a specialized association rule mining algorithm that integrates temporality into the rule discovery process. In addition to frames, shots and scenes, the concept of a scene segment is proposed to define and extract the elements of association rules. Feature fusion combines several feature sources: audio, motion, keypoint descriptors, temporal occurrence characteristics and the fully connected layer outputs of a CNN model. The proposed decision fusion approach employs logistic regression to formulate the relation between the dependent variable (event type) and the independent variables (classifiers' outputs) in terms of decision weights. Multi-modal fusion-based scene classifiers are employed in event recognition. Rule-based event modeling and multi-modal fusion are shown to be promising approaches for event recognition, and the proposed algorithm is open to the fusion of new sources and to the integration of new event types. The accuracy of the proposed methodology is evaluated on the CCV and Hollywood2 datasets, and the results are compared with benchmark implementations in the literature.
MULTIMEDIA SYSTEMS
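
The decision fusion step described in the abstract, logistic regression over classifier outputs yielding decision weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the three modality classifiers, the toy scores, the function names `fit_fusion_weights` and `fuse`, and the plain gradient-descent training loop are all assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_fusion_weights(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression decision weights by batch gradient descent.

    scores: list of rows, one confidence score per classifier (hypothetical).
    labels: 1 if the event is present in the sample, else 0.
    Returns (weights, bias); weights[j] is the decision weight of classifier j.
    """
    n_samples = len(scores)
    n_classifiers = len(scores[0])
    weights = [0.0] * n_classifiers
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_classifiers
        grad_b = 0.0
        for row, y in zip(scores, labels):
            p = sigmoid(bias + sum(w * s for w, s in zip(weights, row)))
            err = p - y  # gradient of the log-loss w.r.t. the linear output
            for j, s in enumerate(row):
                grad_w[j] += err * s
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def fuse(row, weights, bias):
    """Fused probability that the event is present, given classifier scores."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, row)))


# Toy data (assumed): columns are visual, audio and motion classifier scores.
scores = [
    [0.90, 0.7, 0.4], [0.80, 0.6, 0.6], [0.70, 0.8, 0.3], [0.85, 0.4, 0.7],
    [0.20, 0.3, 0.6], [0.10, 0.4, 0.5], [0.30, 0.2, 0.4], [0.15, 0.5, 0.7],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_fusion_weights(scores, labels)
print("decision weights:", weights, "bias:", bias)
print("fused score for first sample:", fuse(scores[0], weights, bias))
```

In this sketch the learned weight of each classifier reflects how informative its scores are for the event label, which is one way to read "decision weights". A new source could be fused by adding a column to `scores` and refitting.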

Citation Formats
M. Güder and F. N. Çiçekli, “Multi-modal video event recognition based on association rules and decision fusion,” MULTIMEDIA SYSTEMS, pp. 55–72, 2018, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/62691.