Water Quality Monitoring for Disease Prediction using Machine Learning

  • Prajakta Patil, Sukanya More, Atharv Deshpande, Harshal Todkar, Sanjeev Wagh


Access to pure drinking water and sanitation was recognized as a fundamental human right, "The Human Right to Water and Sanitation", by the United Nations General Assembly on 28 July 2010. Water-related diseases are among the primary causes of illness and death around the world, accounting for more than 3.4 million deaths per year. Inadequate monitoring of water sources and the inability to anticipate the spread of waterborne diseases lie at the root of these deaths, so there is a compelling need for disease prediction based on water quality. The present study focused on monitoring water quality parameters and using them to predict probable waterborne diseases. Its main objective was to apply machine learning techniques to water quality data in order to make such predictions. Observations of some of the water quality parameters were collected using the Internet of Things (IoT). The detailed data, covering observations of all the necessary parameters, were collected from the West Bengal Pollution Control Board's Water Quality Information System. A Gradient Boosting Classifier was trained and tested on the collected data. Its accuracy was 0.92 on cross-validation and 0.95 on hold-out data. Once trained, the model made predictions based on primary data. The predicted diseases were conveyed as alerts using the Pushbullet service. The study thus proposes the usability of water quality parameters for the early prediction of water-related diseases.
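
The evaluation protocol summarized above (a gradient boosting classifier scored by cross-validation and on a hold-out split) can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation. The data are synthetic, the two features (pH and turbidity), the binary "risk" label and its thresholds, and all function names are assumptions made purely for illustration, and the booster is a simplified variant that fits decision stumps to log-loss gradients.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Least-squares decision stump: one feature, one threshold, two leaf values."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def stump_value(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_gradient_boosting(X, y, n_rounds=30, lr=0.3):
    """Binary gradient boosting on log loss; each round fits a stump to y - p."""
    p = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    f0 = math.log(p / (1 - p))          # initial score: log-odds of the positive class
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]  # negative gradient
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        scores = [s + lr * stump_value(stump, row) for s, row in zip(scores, X)]
    return f0, lr, stumps

def predict(model, row):
    f0, lr, stumps = model
    score = f0 + lr * sum(stump_value(s, row) for s in stumps)
    return 1 if score > 0 else 0        # score > 0 is equivalent to probability > 0.5

def accuracy(model, X, y):
    return sum(predict(model, row) == t for row, t in zip(X, y)) / len(y)

def cross_val_accuracy(X, y, k=5):
    """Mean accuracy over k folds (fold i takes every k-th sample starting at i)."""
    fold_scores = []
    for i in range(k):
        test_idx = set(range(i, len(X), k))
        train_idx = [a for a in range(len(X)) if a not in test_idx]
        model = fit_gradient_boosting([X[a] for a in train_idx], [y[a] for a in train_idx])
        held = sorted(test_idx)
        fold_scores.append(accuracy(model, [X[a] for a in held], [y[a] for a in held]))
    return sum(fold_scores) / k

def make_synthetic(n, seed=0):
    """Synthetic stand-in data; the labelling rule is invented for this sketch."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        ph, turbidity = rng.uniform(5.5, 9.0), rng.uniform(0.0, 10.0)
        X.append([ph, turbidity])
        y.append(1 if (turbidity > 5.0 or ph < 6.5) else 0)  # hypothetical "risk" rule
    return X, y

X, y = make_synthetic(200)
split = int(0.8 * len(X))               # 80% for training and CV, 20% held out
X_train, y_train, X_hold, y_hold = X[:split], y[:split], X[split:], y[split:]

cv_acc = cross_val_accuracy(X_train, y_train)
model = fit_gradient_boosting(X_train, y_train)
holdout_acc = accuracy(model, X_hold, y_hold)
print(f"cross-validation accuracy: {cv_acc:.2f}, hold-out accuracy: {holdout_acc:.2f}")
```

The accuracies printed by this sketch describe only the synthetic data and have no bearing on the 0.92 and 0.95 figures reported for the study. In practice a library implementation (for example scikit-learn's `GradientBoostingClassifier`) would replace the hand-written booster, and the labels would be disease classes rather than a single binary flag.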