Analysis of Data Preprocessing Techniques with Software Bug Model Datasets

Salahuddin Shaikh^a, Liu Changan^b, Maaz Rasheed Malik^c

Abstract

In practice, software metrics datasets are highly complex, which makes it difficult to draw a
clear distinction between defective and non-defective software modules. It is therefore
imperative to preprocess the metrics data before it is used for data modeling. We review data
preprocessing methods and present an empirical study of them on software metrics datasets. These
preprocessing methods can be used to enhance the efficiency and effectiveness of classifiers on
software bug datasets and to overcome classification problems. Experimental analysis shows that,
in the majority of software bug prediction scenarios, feature selection did not increase the TP
rate or the positive accuracy; both declined when contrasted against the other preprocessing
techniques. Among the classifiers, our analysis shows that Random Forest, IBk, Logistic, and
Naïve Bayes improve efficiency and perform better than all other classifiers on software bug
model datasets. Propositionalization is more efficient than the other methods, especially
propositionalization with the functions Logistic classifier.