A Comparative Study of Stochastic Gradient Descent and Naïve Bayes Multinomial for Text Classification on Spam Words
Abstract
E-mail is a versatile method of communication in the current day and age. It allows a person to stay connected with people anywhere in the world. The use of e-mail has skyrocketed, and the number of users who are unaware of the malicious content that lies within the online world is also immense. There are several instances where users have had their sensitive information compromised or have lost resources to malicious messages. Such messages are known as spam. We aim to determine whether Naïve Bayes Multinomial performs better than Stochastic Gradient Descent when the dataset is very large, and whether there is a dataset size at which it surpasses Stochastic Gradient Descent, so as to identify the better technique for spam filtering.