Imbalanced Data Learning With a Novel Ensemble Technique: Extrapolation-SMOTE SVM Bagging

Sakshi Hooda, Suman Mann

Abstract

Class imbalance is ubiquitous in real-world data and has attracted the interest of researchers in many fields. Although a classifier can be trained directly on an imbalanced dataset, doing so usually yields unsatisfactory results, because the focus on overall accuracy leads to a suboptimal model. Various approaches have therefore been proposed to address the issue, including cost-sensitive learning, sampling, and hybrid techniques. Despite this promising trend, two points deserve more emphasis: samples lying close to the decision boundary carry more discriminative information and are worth valuing, and the skew of the boundary can be corrected by constructing synthetic samples. Inspired by this geometric intuition, the aim of this study was to develop a new synthetic minority oversampling technique that incorporates borderline information. Ensemble frameworks, in turn, have been documented to capture robust and complicated decision boundaries in real-world settings. With these factors considered, Bagging of Extrapolation Borderline-SMOTE SVM (BEBS) was proposed to handle imbalanced data learning (IDL). Experiments on open-access datasets showed significantly superior performance when the proposed BEBS framework was applied. The novelty of the model lies in combining borderline information with an ensemble of SVMs.
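As a rough illustration of the geometric idea, the sketch below contrasts the interpolation step of classic SMOTE with one possible reading of an extrapolation step. It is not the BEBS procedure itself: the function names, the choice of an interior minority point as the anchor, and the `max_step` parameter are all assumptions made for illustration, and the selection of borderline samples and the SVM bagging stage are omitted.

```python
import random


def interpolate(x, neighbor, rng):
    """Classic SMOTE step: a point on the segment from x to a minority neighbor."""
    lam = rng.random()  # lam in [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)]


def extrapolate(x, anchor, rng, max_step=1.0):
    """Hypothetical extrapolation step (an assumption, not the paper's formula).

    Moves from the borderline minority point x further away from an interior
    minority point `anchor`, so the synthetic sample lands beyond x on the
    side facing the decision boundary.
    """
    lam = rng.uniform(0.0, max_step)
    return [xi + lam * (xi - ai) for xi, ai in zip(x, anchor)]


if __name__ == "__main__":
    rng = random.Random(0)
    borderline = [1.0, 1.0]  # minority sample near the boundary (made-up values)
    interior = [0.0, 0.0]    # minority sample deeper inside its class
    print("interpolated:", interpolate(borderline, interior, rng))
    print("extrapolated:", extrapolate(borderline, interior, rng))
```

Under these assumptions, the interpolated point always stays between the two minority samples, whereas the extrapolated point lies past the borderline sample, which is one way synthetic samples could push a skewed boundary back toward the majority class.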