Performance Enhancement of Different Machine Learning Algorithms Using Up-Sampling Technique
Depression is a mental disorder and one of the major causes of suicide. People sometimes post signs of their depression, knowingly or unknowingly, on social media platforms. Analyzing these posts can help identify such people and prevent their depression from worsening. Machine learning techniques make this task possible: given a sufficient amount of pre-labelled data, a model can be trained and then used to classify new incoming posts on social networking sites. This work focuses on building a model that improves the accuracy of machine learning algorithms in identifying the 'depression' state. We use the Naïve Bayes and K-Nearest Neighbor algorithms for classification and tabulate their performance in terms of accuracy and F1 score. The proposed model applies an up-sampling technique to the false positive and false negative data. On the validation dataset, this approach improves the accuracy and F1 score of Naïve Bayes by 0.9% and 3.783%, respectively, and those of K-Nearest Neighbor by 2.1% and 0.759%, respectively. We then tested the proposed model with a different set of dataset features for both Naïve Bayes and K-Nearest Neighbor and obtained satisfactory results.
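The up-sampling idea can be illustrated with a minimal sketch. The abstract does not say how the false positive and false negative samples are up-sampled, so the sketch assumes the simplest reading: training examples that an initial classifier got wrong are duplicated before retraining. All function names, the `copies` parameter, and the toy data are hypothetical and are not taken from this work.

```python
def misclassified_indices(y_true, y_pred, positive=1):
    """Return (false_positive_indices, false_negative_indices)."""
    pairs = list(enumerate(zip(y_true, y_pred)))
    fp = [i for i, (t, p) in pairs if p == positive and t != positive]
    fn = [i for i, (t, p) in pairs if p != positive and t == positive]
    return fp, fn


def upsample_misclassified(X, y, y_pred, copies=1, positive=1):
    """Append `copies` duplicates of every FP/FN example to the training set.

    Assumption: up-sampling means plain duplication with the true label kept.
    """
    fp, fn = misclassified_indices(y, y_pred, positive)
    X_aug, y_aug = list(X), list(y)
    for _ in range(copies):
        for i in fp + fn:
            X_aug.append(X[i])
            y_aug.append(y[i])
    return X_aug, y_aug


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    fp, fn = misclassified_indices(y_true, y_pred, positive)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    precision = tp / (tp + len(fp)) if tp + len(fp) else 0.0
    recall = tp / (tp + len(fn)) if tp + len(fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): label 1 = 'depression', 0 = 'not depression'.
posts = ["post a", "post b", "post c", "post d"]
labels = [1, 0, 1, 0]
first_pass_predictions = [1, 1, 0, 0]  # output of some initial classifier

posts_aug, labels_aug = upsample_misclassified(posts, labels, first_pass_predictions)
```

In this sketch the augmented set would then be used to retrain the classifier (Naïve Bayes or K-Nearest Neighbor), and `accuracy` and `f1_score` would be compared before and after retraining on a held-out validation set.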