A gaze-centered multimodal approach to human-human social interaction

Acartürk, Cengiz
Kalkan, Sinan
Aydin, Ulku Arslan
This study investigates gaze aversion behavior in human-human dyads during the course of a conversation. Our goal is to identify the parametric infrastructure that will underlie the development of gaze behavior in human-robot interaction. We employed a job interview setting, in which pairs (an interviewer and an interviewee) conducted mock job interviews. Three pairs of native speakers took part in the experiment. Two eye-tracking glasses recorded the scene video, the audio, and the eye gaze positions of the participants. The analyses involved synchronization of multimodal data, including video recording data for face tracking, gaze data from the eye trackers, and audio data for speech segmentation. We investigated the frequency, duration, timing, and spatial positions of gaze aversions relative to the interlocutor's face. The results revealed that the interviewees averted their gaze more frequently than the interviewers. Moreover, gaze aversions lasted longer when accompanied by speech. In addition, specific speech instances, such as pauses and speech-end signals, have a significant impact on gaze aversion behavior.
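As an illustration only, the frequency and duration measures described above could be derived from synchronized gaze and face-tracking samples roughly as follows. This is a minimal sketch, not the authors' pipeline: the `Sample` structure, the rectangular face box, and the rule that any gaze point outside the box counts as an aversion are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One synchronized sample (hypothetical structure, not from the paper)."""
    t: float                                   # timestamp in seconds
    gaze: Tuple[float, float]                  # gaze point in scene-video pixels
    face: Tuple[float, float, float, float]    # face box (x0, y0, x1, y1)


def is_averted(s: Sample) -> bool:
    """Assumed rule: gaze is averted when it falls outside the face box."""
    x, y = s.gaze
    x0, y0, x1, y1 = s.face
    return not (x0 <= x <= x1 and y0 <= y <= y1)


def aversion_episodes(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Group consecutive averted samples into (start, end) episodes.

    An episode ends at the timestamp of the first sample that is back on
    the face, or at the last timestamp if the recording ends mid-episode.
    """
    episodes: List[Tuple[float, float]] = []
    start = None
    last_t = None
    for s in samples:
        if is_averted(s):
            if start is None:
                start = s.t
        elif start is not None:
            episodes.append((start, s.t))
            start = None
        last_t = s.t
    if start is not None:
        episodes.append((start, last_t))
    return episodes


def summarize(episodes: List[Tuple[float, float]]) -> Tuple[int, float]:
    """Return (number of aversions, mean duration in seconds)."""
    if not episodes:
        return 0, 0.0
    durations = [end - start for start, end in episodes]
    return len(episodes), sum(durations) / len(durations)
```

Running `aversion_episodes` separately on the interviewer's and the interviewee's samples and comparing the `summarize` outputs would give the kind of frequency and duration comparison reported in the abstract, under the stated assumptions.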
Citation Formats
C. Acartürk, S. Kalkan, and U. A. Aydin, “A gaze-centered multimodal approach to human-human social interaction,” presented at the 3rd IEEE International Conference on Cybernetics (CYBCONF), Exeter, England, 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/30991.