Novel approach to emotion recognition in voice: a convolutional neural network approach and Grad-CAM generation

Canpolat, Salih Fıra
Emotion is an essential component of both human-human and human-machine interaction, and sound is one of the most common communication channels. Understanding the underlying mechanisms of emotion recognition in the sound signal is therefore an essential step in improving both types of interaction. For this purpose, we developed an emotion recognition model and a Turkish-specific database, referred to as the Turkish Emotion-Voice (TurEV) database. The database contains one-word vocalizations of four emotion types (angry, calm, happy, and sad) in three different frequency bands. The model was trained using TurEV, and human validation studies were conducted. The results indicate that the model is feasible for emotion recognition tasks. The comparison of humans with the computational model indicates that the model achieves better results using feature-rich frequency bands, whereas humans use all other aspects of the sound signal.