Dynamic gaze analysis: An application environment for face-to-face communication

Arslan Aydın, Ülkü
Kalkan, Sinan
Acartürk, Cengiz
2017 International Artificial Intelligence and Data Processing Symposium (IDAP)


MAGiC: A Multimodal Framework for Analysing Gaze in Dyadic Communication
Aydin, Ulku Arslan; Kalkan, Sinan; Acartürk, Cengiz (2018-01-01)
The analysis of dynamic scenes has been a challenging domain in eye tracking research. This study presents a framework, named MAGiC, for analyzing gaze contact and gaze aversion in face-to-face communication. MAGiC provides an environment that is able to detect and track the conversation partner's face automatically, overlay gaze data on top of the face video, and incorporate speech by means of speech-act annotation. Specifically, MAGiC integrates eye tracking data for gaze, audio data for speech segmentati...
Bimodal automatic speech segmentation based on audio and visual information fusion
Akdemir, Eren; Çiloğlu, Tolga (2011-07-01)
Bimodal automatic speech segmentation using visual information together with audio data is introduced. The accuracy of automatic segmentation directly affects the quality of speech processing systems that use the segmented database. Combining audio and visual data results in a lower average absolute boundary error between the manual and automatic segmentation results. The information from the two modalities is fused at the feature level and used in an HMM-based speech segmentation system. A T...
Performance evaluation of real-time noisy speech recognition for mobile devices
Yurtcan, Yaser; Günel Kılıç, Banu; Department of Information Systems (2019)
Communication is important for people. There are many available communication methods. One of the most effective methods is through the use of speech. People can comfortably express their feelings and thoughts by using speech. However, some people may have a hearing problem. Furthermore, understanding spoken words in a noisy environment could be a challenge even for healthy people. Speech recognition systems enable real-time speech to text conversion. They mainly involve capturing of the sound waves and con...
Multilingual Video Indexing and Retrieval Employing an Information Extraction Tool for Turkish News Texts: A Case Study
Kucuk, Dilek; Yazıcı, Adnan (2011-10-28)
In this paper, a multilingual video indexing and retrieval system is proposed which relies on an information extraction tool, a hybrid named entity recognizer, for Turkish to determine the semantic annotations for the considered videos. The system is executed on a set of news videos in English and encompasses several other components including an automatic speech recognition system for English, an English-to-Turkish machine translation system, a news video database, and a semantic video retrieval interface....
Elderly Speech-Gaze Interaction: State of the Art and Challenges for Interaction Design
Acartürk, Cengiz; Fal, Mehmetcal; Dias, Miguel Sales (2015-08-07)
Elderly people face problems when using current forms of Human-Computer Interaction (HCI). Developing novel and natural methods of interaction would help resolve some of those issues. We propose that HCI can be improved by combining communication modalities, in particular speech and gaze, in various ways. This study presents elderly speech-gaze interaction as a novel method in HCI, reviews the literature on its potential use, and discusses possible domains of application for further empirical i...
