Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing

Alatan, Abdullah Aydın
Akansu, AN
Wolf, W
A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both the audio track and the visual content to determine the state/scene transitions of the model. In simulations with circular and left-to-right HMM topologies, both perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
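
To illustrate the difference between the two topologies named in the abstract, the following is a minimal sketch of Viterbi decoding over a sequence of per-shot observation symbols. Nothing in it is taken from the paper: the three states, the three observation symbols, and every probability value are hypothetical placeholders chosen for illustration, and the model described in the abstract may differ in all of these respects. The only structural point carried over is that a circular topology lets the last state return to the first, while a left-to-right topology does not.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi(obs, start, trans, emit):
    """Return the most likely state sequence for a list of observation indices."""
    n = len(start)
    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def scene_boundaries(path):
    """Shot indices at which the decoded state (scene type) changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Hypothetical 3-state model; all numbers below are assumptions.
START = [1.0, 0.0, 0.0]

# Circular: each state may stay or advance, and the last wraps to the first.
CIRCULAR = [
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.3, 0.0, 0.7],
]

# Left-to-right: same as above, but the last state is absorbing.
LEFT_TO_RIGHT = [
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]

# Hypothetical symbols: 0 = no speech/no face, 1 = speech + face, 2 = other.
EMIT = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

# One observation symbol per shot (made-up sequence).
SHOTS = [0, 0, 1, 1, 1, 2, 0, 1, 1]

circular_path = viterbi(SHOTS, START, CIRCULAR, EMIT)
ltr_path = viterbi(SHOTS, START, LEFT_TO_RIGHT, EMIT)
print("circular:     ", circular_path, scene_boundaries(circular_path))
print("left-to-right:", ltr_path, scene_boundaries(ltr_path))
```

With these made-up parameters, the circular model can re-enter the first state when the observations repeat, so it can mark a second dialogue scene. The left-to-right model can only move forward, so it has to absorb the later shots into an earlier state.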