Missing values analysis techniques in Data mining: Review

  • Mohammed Sharik U Zama et al.

Abstract

Missing data is a prevalent problem in data analytics. Research studies and surveys often have missing values in their observations, and missing data can dramatically degrade the quality of a data set. In real-world databases and data warehouses, data is often inaccurate, incomplete, and inconsistent. There can be numerous reasons for this, such as human or computer errors during data entry, respondents purposefully submitting incorrect answers, and faulty measurements. Missing data can have several negative effects on the knowledge discovery process, such as biased results and invalid conclusions. Analyzing the data becomes an arduous task when values are missing, mainly because data mining algorithms perform well primarily on data sets that are consistent and complete. Fortunately, several techniques can be employed in the data preprocessing stage to handle missing data. The purpose of this paper is to compare and classify methods for handling missing data. The study results in a comparison and classification of these methods, along with the advantages and disadvantages of each.

Published
2019-11-15
How to Cite
Mohammed Sharik U Zama et al. (2019). Missing values analysis techniques in Data mining: Review. International Journal of Advanced Science and Technology, 28(15), 377-382. Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/1638
Section
Articles