Early Prediction of Coronary Heart Disease from Cleveland Dataset using Machine Learning Techniques
The mortality rate among people affected by heart disease continues to rise. Moreover, approximately 50% of patients suffering from heart disease survive fewer than 10 years. The exponential growth of high-dimensional data in the medical domain calls for automated data analysis. Hence, an effective computational intelligent system is needed to detect and predict heart disease in advance. Angiography is an imaging test used to detect heart disease, but it is costly, can cause severe side effects in patients, and requires experts to interpret the patient data. To facilitate the process, we propose an effective computational intelligent system that integrates Principal Component Analysis (PCA) with machine learning classifier models, namely k-Nearest Neighbour (k-NN), Support Vector Machine (SVM) and Logistic Regression (LR), to predict whether or not a person is affected by heart disease. To validate the performance of the models, we use accuracy, specificity, sensitivity, error rate, and the Matthews Correlation Coefficient. On these measures, the proposed system gives promising results in heart disease prediction. Among the three classifier models, LR performs best across all performance measures, with SVM giving comparable results. In the future, we plan to gather and investigate real-world datasets and to apply a more diverse mix of machine learning techniques.
Keywords: Machine Learning, Coronary Heart Disease, Support Vector Machine, k-Nearest Neighbour, Logistic Regression, Principal Component Analysis.
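
For readers unfamiliar with the performance measures named in the abstract, the following is a minimal sketch of how they are conventionally computed from a binary confusion matrix. It uses the standard textbook definitions and is not taken from the paper's implementation. The function name `binary_metrics` and the example counts are hypothetical and are not results from this study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/tn/fp/fn are the counts of true positives, true negatives,
    false positives and false negatives. Hypothetical helper for illustration.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Sensitivity: fraction of diseased cases correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of healthy cases correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    error_rate = 1.0 - accuracy

    # Matthews Correlation Coefficient; defined as 0 here when any
    # marginal total is zero (a common convention, stated as an assumption).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "error_rate": error_rate,
        "mcc": mcc,
    }


# Illustrative counts only; these are NOT results reported in the paper.
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews Correlation Coefficient uses all four cells of the confusion matrix, so it remains informative when the two classes are imbalanced.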