Outlier Detection Based on Machine Learning Techniques
Abstract
Outliers are studied across many research fields and application domains. In this paper, we analyse and bring together various outlier detection techniques, with the aim of better understanding the different approaches to outlier detection. The goal of the project was to detect outliers in housing prices in Melbourne (Australia) using statistical and Machine Learning prediction models. All Machine Learning models were unsupervised: Isolation Forest, Elliptic Envelope, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Local Outlier Factor (LOF). Outlier detection was performed on both univariate and multivariate data, and the results of each model were visualised for the multivariate data. A dummy data frame of 1000 random observations and 4 features was created to apply parametric methods to univariate and multivariate data.
Keywords: Outliers, Outlier Detection, Univariate, Multivariate.
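As an illustration of the setup described in the abstract, the following is a minimal sketch that applies the four unsupervised detectors to a synthetic data set of 1000 observations and 4 features. It assumes scikit-learn, which the text does not name. The planted outliers, the contamination rate, and the DBSCAN parameters (eps, min_samples) are hypothetical choices for illustration and are not the settings used in the study.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dummy data: 1000 observations, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))

# Assumption: overwrite the first 10 rows with points far from the bulk,
# scattered in random directions, so there is something to detect.
signs = rng.choice([-1.0, 1.0], size=(10, 4))
X[:10] = signs * rng.uniform(6.0, 10.0, size=(10, 4))

# Put all features on a comparable scale for the distance-based models.
X_scaled = StandardScaler().fit_transform(X)

# Assumed share of outliers; a tuning choice, not a value from the study.
contamination = 0.02

labels = {
    "IsolationForest": IsolationForest(
        contamination=contamination, random_state=0
    ).fit_predict(X_scaled),
    "EllipticEnvelope": EllipticEnvelope(
        contamination=contamination, random_state=0
    ).fit_predict(X_scaled),
    "LocalOutlierFactor": LocalOutlierFactor(
        n_neighbors=20, contamination=contamination
    ).fit_predict(X_scaled),
    # DBSCAN has no contamination parameter; eps and min_samples are assumed.
    "DBSCAN": DBSCAN(eps=1.5, min_samples=5).fit_predict(X_scaled),
}

# In all four models the label -1 marks an outlier (noise, for DBSCAN).
outlier_masks = {name: y == -1 for name, y in labels.items()}

for name, mask in outlier_masks.items():
    print(f"{name}: {int(mask.sum())} observations flagged, "
          f"{int(mask[:10].sum())} of the 10 planted points among them")
```

Each detector marks outliers with the label -1; for DBSCAN, that label denotes noise points that belong to no cluster, so the number of flagged observations depends on eps and min_samples rather than on a contamination rate.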