A Novel Approach for Text Classification using Recurrent Neural Networks

Divanu Sameera, Vedavathi K, Durga Prasad Kavadi

Abstract

Text classification, also known as text categorization, is a classical task in natural language processing. Its main aim is to assign one or more predefined classes or categories to text documents. Text classification has a wide variety of applications, such as email classification, opinion classification and news article classification. Researchers have proposed many different approaches to text classification in different domains. In general, most of these approaches follow a sequence of steps: training data collection, pre-processing of the training data, feature extraction, feature selection, representation of the training documents and selection of a classification algorithm. Among these steps, feature selection is important because it identifies the features that matter most for classification. In this work, a new feature selection technique is proposed to identify the prominent features and improve the accuracy of text classification. The experiment was conducted on the AG news article dataset using three classifiers: Support Vector Machine, Naive Bayes Multinomial and Random Forest. Among the three classifiers, Random Forest attained good accuracy for text classification. To improve the accuracy further, the experiment was continued with two deep learning techniques, Long Short Term Memory and Recurrent Neural Networks, and it was observed that the former achieved good accuracy for text classification. The accuracies obtained in this work are promising compared with most existing approaches to text classification.
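
The generic pipeline outlined in the abstract (pre-processing, feature extraction, feature selection, document representation) can be illustrated with a minimal sketch. The abstract does not describe the proposed feature selection technique, so the sketch substitutes the standard chi-square statistic as a stand-in; it should not be read as the method of this work. The toy corpus, the function names and the choice of k are hypothetical, and the classification step is omitted.

```python
import re
from collections import Counter

# Toy labelled corpus (hypothetical; NOT the AG news article dataset).
DOCS = [
    ("stocks rally as markets close higher", "business"),
    ("central bank raises interest rates", "business"),
    ("markets fall as bank shares slide", "business"),
    ("team wins championship final", "sports"),
    ("striker scores twice as team wins", "sports"),
    ("coach praises team after final", "sports"),
]


def tokenize(text):
    """Pre-processing: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def chi_square_scores(docs):
    """Feature extraction and scoring.

    Each term is scored by its maximum chi-square statistic over the
    classes. Chi-square is an assumed stand-in, not the paper's technique.
    """
    n = len(docs)
    class_counts = Counter(label for _, label in docs)
    term_class = Counter()  # documents of a class that contain a term
    term_docs = Counter()   # documents overall that contain a term
    for text, label in docs:
        for term in set(tokenize(text)):
            term_class[(term, label)] += 1
            term_docs[term] += 1

    scores = {}
    for term, df in term_docs.items():
        best = 0.0
        for label, n_c in class_counts.items():
            a = term_class[(term, label)]  # in class, contains term
            b = df - a                     # other classes, contains term
            c = n_c - a                    # in class, lacks term
            d = n - a - b - c              # other classes, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, k):
    """Feature selection: keep the k highest-scoring terms."""
    scores = chi_square_scores(docs)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def vectorize(text, vocab):
    """Representation: term-count vector over the selected vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[t] for t in vocab]


if __name__ == "__main__":
    vocab = select_features(DOCS, k=5)
    print(vocab)
    print(vectorize("team wins as markets fall", vocab))
```

The resulting count vectors are the kind of input that a classifier such as Naive Bayes Multinomial or Random Forest would then be trained on.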