Comparative Analysis of Frequent Pattern Mining Algorithm on Water Quality Data

Ms. P Mahalakshmi, Ansh Kaul, Yash Aggarwal

Abstract

Humans generate very large volumes of data every day: roughly 2.5 exabytes (1 exabyte = 10¹⁸ bytes) per day. Many data mining methods have therefore been proposed to achieve better execution time and complexity on such volumes. One class of these algorithms extracts the most frequently occurring patterns in transactional databases; these are called association mining algorithms. Dependencies between transactions across time and location further increase the complexity of the frequent itemset mining task. The proposed work aims to recognize and extract the common patterns from such transactional data. The dependency of water quality data on various factors is used to discover pollutants that regularly co-occur across numerous water bodies in different states of India. These pollutants include nitrates and coliform. Other factors that influence water quality are pH, temperature, biochemical oxygen demand (BOD), and dissolved oxygen (DO). Various strategies have been proposed to mine frequently occurring patterns quickly and accurately. Our work, however, promotes a general hash-based methodology that can be applied to any numerical data, including water quality data. This method, the hash-based Apriori algorithm [1], is a modification of the traditional Apriori algorithm. A comparison with the FP-growth algorithm in terms of execution time is also presented.

Keywords: water quality, association mining, hash-based Apriori algorithm, frequent pattern growth (FP-growth) algorithm.