Applications of Neural Network in Speech Recognition

  • Sakshi Singh

Abstract

Human speech carries a great deal of information, and humans have used it as a form of communication since the Stone Age. In contemporary times, it has become increasingly important to exploit this composite information. Modern machine learning algorithms allow us to recognize and manipulate speech. Emotion recognition is another powerful tool in numerous fields such as human-computer interaction, psychology, and mass multimedia. Speech Emotion Recognition (SER) has been a long-standing challenge for the research community. In this paper, we propose a deep learning model for classifying emotions in speech. We use both audio and text features and compare them in detail. We give an extensive comparison between audio-based models, such as a Convolutional Neural Network and a Gated Recurrent Unit network, and text-based models, such as a Long Short-Term Memory network and a Gated Recurrent Unit network. We use the IEMOCAP dataset in our experiments and report our findings.

Published
2020-08-01
Section
Articles