Comparative Study of Convolutional Neural Network and Support Vector Machine for Emotion Detection from Speech Signal
Human emotion recognition plays a key role in developing interpersonal relationships. Emotions are expressed through speech, hand and body gestures, and facial expressions. Speech Emotion Recognition (SER) has been a topic of research in human-machine interface applications for many years, and many systems have been developed solely for identifying emotions. This paper explains how a real-time system detects a person’s emotion from speech. The classifiers used can predict emotions such as Happy, Angry, Fear, and Calm, among others. The databases used for the speech emotion recognition system are the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS). The features extracted from these datasets are Energy, Zero Crossing Rate, and Mel-Frequency Cepstral Coefficients (MFCC). The study compares two classification models: Convolutional Neural Network (CNN) and Support Vector Machine (SVM). To make the system easier to interact with, the Tkinter Python package is used to build a Graphical User Interface (GUI).
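As an illustration of two of the features named above, the following minimal sketch computes short-time energy and zero crossing rate per frame using only the Python standard library. It is not the paper's implementation: the helper names (`frame_signal`, `short_time_energy`, `zero_crossing_rate`), the 16 kHz sampling rate, the 25 ms frame length, the 10 ms hop, and the synthetic test tone are all assumptions chosen for the example. MFCC extraction is omitted because it additionally requires a mel filter bank and a discrete cosine transform.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Assumed setup: a synthetic 440 Hz tone, 0.1 s long, sampled at 16 kHz.
sr = 16000
signal = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 10)]

# Assumed framing: 25 ms frames (400 samples) with a 10 ms hop (160 samples).
frames = frame_signal(signal, frame_len=400, hop_len=160)

# One (energy, zero crossing rate) pair per frame.
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
```

For real recordings the samples would come from the audio files in the dataset rather than a synthetic tone, and the per-frame values would typically be summarized or stacked before being passed to a classifier.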