Combining Differential Algorithm with Node Centrality Criterion for Increasing the Accuracy of High Dimension Data Selection

Azad Shojaei, Javad Mohammadzadeh, Keyhan Khamforoosh

Abstract

Data are usually described by a large number of features, many of which may be
irrelevant or redundant for data mining applications. Such features degrade the
performance of machine learning algorithms and increase computational complexity.
Therefore, reducing the dimensionality of a dataset is a fundamental task in data mining
and machine learning applications. The main purpose of this study is to combine the node
centrality criterion with the differential evolution algorithm to increase the accuracy of
feature selection. The proposed
method was evaluated on the datasets and compared with the most recent and well-known
feature selection methods. Classification accuracy, number of selected features, and
execution time were used as comparison criteria. The results of the comparison were
presented in figures and tables and analyzed in full. The methods were also compared
statistically using tests such as the Friedman test. The results showed that the selected
differential evolution algorithm for clustering, instead of finding all the elements of the
cluster centers in the dataset, found only a limited number of DCT coefficients of these
centers and then reconstructed the cluster centers from those coefficients.
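
To make the idea of combining a node centrality criterion with differential evolution more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that features are nodes of a graph whose edge weights are absolute Pearson correlations, that centrality is the normalized weighted degree of a node, and that the fitness of a feature subset is its mean relevance to the class label minus a penalty on its mean centrality. The function names, the penalty weight alpha, the 0.5 threshold, the DE/rand/1/bin settings, and the toy data are all hypothetical choices made for illustration.

```python
import random
from math import sqrt

def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov / sqrt(vx * vy))

def degree_centrality(columns):
    """Normalized weighted degree of each feature node (assumed centrality)."""
    d = len(columns)
    return [sum(abs_corr(columns[i], columns[j]) for j in range(d) if j != i) / (d - 1)
            for i in range(d)]

def differential_evolution(fitness, dim, pop_size=20, gens=50, F=0.5, CR=0.9, seed=0):
    """Maximize `fitness` over [0, 1]^dim with a basic DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(p) for p in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            s = fitness(trial)
            if s >= scores[i]:  # greedy one-to-one replacement
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def select_features(columns, labels, alpha=0.5, seed=0):
    """Return indices of features whose gene exceeds 0.5 in the best vector."""
    relevance = [abs_corr(col, labels) for col in columns]
    centrality = degree_centrality(columns)

    def fitness(vec):
        chosen = [i for i, v in enumerate(vec) if v > 0.5]
        if not chosen:
            return float("-inf")  # an empty subset is never acceptable
        rel = sum(relevance[i] for i in chosen) / len(chosen)
        red = sum(centrality[i] for i in chosen) / len(chosen)
        return rel - alpha * red

    best, _ = differential_evolution(fitness, len(columns), seed=seed)
    return [i for i, v in enumerate(best) if v > 0.5]

# Toy data (hypothetical): features 0 and 1 track the label, 2 and 3 do not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 1.0, 0.9],
    [0.2, 0.4, 0.2, 0.6, 1.8, 1.6, 2.0, 1.8],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
]
selected = select_features(columns, labels)
print("selected feature indices:", selected)
```

In this sketch each candidate solution is a real-valued vector that is thresholded into a feature mask, which is one common way to apply differential evolution to a discrete selection problem. The centrality penalty discourages choosing features that are strongly correlated with many others; a classifier-based fitness, as suggested by the accuracy criterion in the abstract, could be substituted for the correlation-based one used here.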