MAGiC: A Multimodal Framework for Analysing Gaze in Dyadic Communication

Aydin, Ulku Arslan
Kalkan, Sinan
Acartürk, Cengiz
The analysis of dynamic scenes has been a challenging domain in eye tracking research. This study presents a framework, named MAGiC, for analyzing gaze contact and gaze aversion in face-to-face communication. MAGiC provides an environment that can detect and track the conversation partner's face automatically, overlay gaze data on top of the face video, and incorporate speech by means of speech-act annotation. Specifically, MAGiC integrates eye tracking data for gaze, audio data for speech segmentation, and video data for face tracking. MAGiC is an open source framework, and its usage is demonstrated via publicly available video content and wiki pages. We explored the capabilities of MAGiC through a pilot study and showed that it facilitates the analysis of dynamic gaze data by reducing the annotation effort and the time spent on manual analysis of video data.
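The idea of overlaying gaze data on a tracked face can be illustrated with a minimal sketch. This is not MAGiC's implementation: the names `FaceBox`, `label_gaze`, and `contact_ratio` are hypothetical, and the sketch assumes that each video frame yields one gaze point in pixel coordinates and at most one rectangular face region from a face tracker.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FaceBox:
    """Hypothetical axis-aligned face region in video pixel coordinates."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        # True when the gaze point lies inside or on the edge of the box.
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


def label_gaze(samples: List[Tuple[float, float]],
               boxes: List[Optional[FaceBox]]) -> List[str]:
    """Label each frame as 'contact', 'aversion', or 'no_face'.

    Assumes samples[i] and boxes[i] belong to the same video frame;
    a box of None means the tracker found no face in that frame.
    """
    labels = []
    for (gx, gy), box in zip(samples, boxes):
        if box is None:
            labels.append("no_face")
        elif box.contains(gx, gy):
            labels.append("contact")
        else:
            labels.append("aversion")
    return labels


def contact_ratio(labels: List[str]) -> float:
    """Share of frames with a detected face that were labeled 'contact'."""
    valid = [label for label in labels if label != "no_face"]
    return valid.count("contact") / len(valid) if valid else 0.0


if __name__ == "__main__":
    face = FaceBox(0, 0, 100, 100)
    frame_labels = label_gaze([(50, 50), (200, 50), (10, 10)],
                              [face, face, None])
    print(frame_labels, contact_ratio(frame_labels))
```

Frames without a detected face are kept as a separate label rather than counted as aversion, so that tracking failures do not inflate the aversion share.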


Eye tracking in multimodal comprehension of graphs
Acartürk, Cengiz (2012-07-31)
Eye tracking methodology has been a major empirical research approach for the study of online comprehension processes in reading and scene viewing. The use of eye tracking methodology for the study of diagrammatic representations, however, has been relatively limited so far. The investigation of specific types of diagrammatic representations, such as statistical graphs, is even scarcer. In this study, we propose eye tracking as an empirical research approach for a systematic analysis of multimodal comprehensi...
Bimodal automatic speech segmentation based on audio and visual information fusion
Akdemir, Eren; Çiloğlu, Tolga (2011-07-01)
Bimodal automatic speech segmentation using visual information together with audio data is introduced. The accuracy of automatic segmentation directly affects the quality of speech processing systems using the segmented database. The collaboration of audio and visual data results in lower average absolute boundary error between the manual segmentation and automatic segmentation results. The information from the two modalities is fused at the feature level and used in an HMM-based speech segmentation system. A T...
Wireless speech recognition using fixed point mixed excitation linear prediction (MELP) vocoder
Acar, D; Karci, MH; Ilk, HG; Demirekler, Mübeccel (2002-07-19)
A bit stream based front-end for a wireless speech recognition system that operates on a fixed point mixed excitation linear prediction (MELP) vocoder is presented in this paper. Speaker dependent, isolated word recognition accuracies are obtained from conventional and bit stream based front-end systems, and their statistical significance is discussed. Feature parameters are extracted from original (wireline) and decoded speech (conventional) and from the quantized spectral information (bit stream) of t...
3-D humanoid gait simulation using an optimal predictive control
Özyurt, Gökhan; Özgören, Mustafa Kemal; Department of Mechanical Engineering (2005)
In this thesis, the walking of a humanoid system is simulated by applying an optimal predictive control algorithm. The simulation is built using Matlab and Simulink software. Four separate physical models are developed to represent the single support and the double support phases of a full gait cycle. The models are three dimensional and their properties are analogous to the human's. In this connection, the foot models in the double support phases include an additional joint which connects the toe to the foot...
Sensor fusion of a camera and 2D LIDAR for lane detection and tracking
Yeniaydın, Yasin; Schmidt, Klaus Verner; Department of Electrical and Electronics Engineering (2019)
This thesis proposes a novel lane detection and tracking algorithm based on sensor fusion of a camera and 2D LIDAR. The proposed method is based on the top down view of a grayscale image, whose lane pixels are enhanced by the convolution with a 1D top-hat kernel. The convolved image is horizontally divided into a predetermined number of regions and the histogram of each region is computed. Next, the highest valued local maxima in a predefined ratio in the histogram plots are determined as candidate lane pix...
Citation Formats
U. A. Aydin, S. Kalkan, and C. Acartürk, “MAGiC: A Multimodal Framework for Analysing Gaze in Dyadic Communication,” JOURNAL OF EYE MOVEMENT RESEARCH, pp. 0–0, 2018, Accessed: 00, 2020. [Online]. Available: