CountVectorizer-Enabled Email Classification Using Logistic Regression and WordCloud System
The aim of the email classification model is to classify user emails as useful or not useful, i.e., ham or spam, respectively. A system has been developed to interpret and classify emails according to their contents. Many supervised and unsupervised algorithms have already been developed in this domain to meet the goals of users and organizations. In this system, a logistic regression algorithm with CountVectorizer is used to classify emails as ham or spam accurately and efficiently. Further, a word cloud is used so that it can easily be observed which type of email, ham or spam, a user or an organization mostly receives. The results obtained are easy to observe, which makes the outputs easier to analyze.
Keywords: Emails, Ham, Spam, CountVectorizer, Logistic Regression, WordCloud.
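
The pipeline summarized in the abstract (word counts as features, logistic regression as the classifier, per-class word frequencies as the input to a word cloud) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The toy emails, the function names, and the learning-rate and epoch values are all assumptions made for illustration. The hand-written count vectors and gradient-descent training stand in for scikit-learn's CountVectorizer and LogisticRegression, and the frequency table stands in for the rendered word cloud.

```python
import math
import re
from collections import Counter

# Toy, made-up training emails (1 = spam, 0 = ham); not data from the paper.
TRAIN = [
    ("win a free prize now", 1),
    ("free cash offer click now", 1),
    ("claim your free prize today", 1),
    ("meeting agenda for monday", 0),
    ("please review the project report", 0),
    ("lunch with the team on monday", 0),
]

def tokenize(text):
    """Lowercase the text and keep word tokens of two or more characters."""
    return re.findall(r"[a-z0-9]{2,}", text.lower())

def fit_vocabulary(texts):
    """Map each distinct token to a column index (sorted for determinism)."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(text, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def spam_probability(text, vocab, w, b):
    """Estimated probability that `text` is spam under the fitted model."""
    x = vectorize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab = fit_vocabulary(texts)
w, b = train_logreg([vectorize(t, vocab) for t in texts], labels)

# Per-class word frequencies: the numbers a word cloud would draw as font sizes.
spam_freq = Counter(tok for t, lab in TRAIN if lab == 1 for tok in tokenize(t))
ham_freq = Counter(tok for t, lab in TRAIN if lab == 0 for tok in tokenize(t))

for message in ["free prize now", "project meeting monday"]:
    p = spam_probability(message, vocab, w, b)
    print(f"{message!r}: {'spam' if p >= 0.5 else 'ham'} (p_spam={p:.2f})")
print("most frequent spam words:", spam_freq.most_common(3))
```

In practice, the hand-written pieces would typically be replaced by `CountVectorizer` and `LogisticRegression` from scikit-learn, and the frequency tables could be passed to `WordCloud.generate_from_frequencies` from the `wordcloud` package to render the images.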