Speech Emotion Detection Using Convolutional Neural Network
Controlling devices and search engines by voice has become a trend in recent years, and voice assistants are now built into cars, mobile phones, TV remotes (for Android TV), and similar products. Many call centers also operate 24x7 for various purposes. Speech recognition systems now perform well, but a gap remains between human and machine; to help close that gap, we turn to emotion recognition. Existing speech emotion recognition models rely on the Support Vector Machine (SVM); they report an accuracy of 60% and involve many pre-processing steps. To overcome these drawbacks, we augment the audio using time shifting and speed change, and classify it with a Convolutional Neural Network (CNN) to improve the accuracy and efficiency of emotion detection. Several recognized datasets exist, including eNTERFACE, SAVEE, RAVDESS, and CASIA. We use the SAVEE dataset for training. Once augmentation is done and the classifier is trained, a live voice recording can be given as input and the system detects the speaker's emotion.
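The two augmentations named above, time shifting and speed change, can be illustrated with a minimal sketch. The abstract does not give the exact procedure or parameters, so everything here is an assumption: the function names (`time_shift`, `change_speed`, `augment`), the circular form of the shift, the linear-interpolation resampling, and the default shift and rate values are all hypothetical. Plain Python lists are used so that the example runs without extra dependencies.

```python
import math

def time_shift(samples, shift):
    """Circularly shift a waveform by `shift` samples (positive = later in time)."""
    n = len(samples)
    if n == 0:
        return []
    k = shift % n
    return samples[-k:] + samples[:-k] if k else list(samples)

def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 speeds up (shorter output)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    n = len(samples)
    if n == 0:
        return []
    out_len = max(1, int(round(n / rate)))
    out = []
    for i in range(out_len):
        pos = min(i * rate, n - 1)       # fractional read position in the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out

def augment(samples, shift=1600, rates=(0.9, 1.1)):
    """Return the original clip plus one shifted and several re-timed copies.

    The default shift and rates are illustrative values, not taken from the paper.
    """
    variants = [list(samples), time_shift(samples, shift)]
    variants += [change_speed(samples, r) for r in rates]
    return variants
```

With the defaults, each training clip yields four waveforms (the original, one shifted copy, one slower copy, and one faster copy), all carrying the emotion label of the source clip. Note that changing speed by simple resampling also changes pitch; whether the authors preserve pitch is not stated.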