PARTITIONAL DISTANCE-BASED PROJECTED CLUSTERING ALGORITHM FOR HIGH DIMENSIONAL DATA

B. HARI BABU, Dr. N. Subhash Chandra, Dr. T. Venu Gopal

Abstract

High-dimensional data clustering is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. In many application areas, data are represented in very high-dimensional spaces; the dimensionality can be in the hundreds, thousands, or more. Clustering high-dimensional data has been a major challenge because of the inherent sparsity of the data points. Most existing clustering algorithms become largely ineffective when the required similarity measure is computed between data points in the full-dimensional space. To address this problem, several projected clustering algorithms have been proposed. However, most of them encounter difficulties when clusters hide in subspaces of very low dimensionality. These challenges motivate our effort to introduce a robust Partitional Distance-Based Projected Clustering algorithm (PDBPC). The proposed algorithm consists of three phases. The first phase analyzes attribute relevance by detecting sparse and dense regions and their locations in each attribute. The second phase eliminates outliers. The third phase discovers clusters in different subspaces. The clustering process is based on the K-means algorithm, with the distance computation restricted to subsets of attributes in which the object values are dense. The proposed algorithm detects projected clusters of low dimensionality embedded in a high-dimensional space and avoids computing distances in the full-dimensional space. The suitability of the proposal was confirmed by an experimental study using synthetic and real data sets. The proposed algorithm is compared with the existing CLIQUE algorithm, and the results show that it provides better accuracy.
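The idea behind the third phase, a K-means iteration whose distance is computed only over each cluster's relevant attributes, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`projected_distance`, `assign`, `update_centers`), the toy data, and the choice to scale the distance by the subspace size are all assumptions made for illustration. The relevant attributes per cluster are supplied by hand here, whereas the algorithm would derive them from the dense regions found in the first phase.

```python
import math


def projected_distance(point, center, dims):
    """Euclidean distance over the attribute indices in `dims` only.

    The result is scaled by the subspace size so that clusters with
    different numbers of relevant attributes stay comparable
    (an assumption of this sketch, not taken from the paper).
    """
    if not dims:
        raise ValueError("each cluster needs at least one relevant attribute")
    squared = sum((point[d] - center[d]) ** 2 for d in dims)
    return math.sqrt(squared / len(dims))


def assign(points, centers, relevant_dims):
    """Assignment step: each point joins the cluster whose center is
    nearest within that cluster's own subspace."""
    labels = []
    for p in points:
        dists = [projected_distance(p, c, dims)
                 for c, dims in zip(centers, relevant_dims)]
        labels.append(dists.index(min(dists)))
    return labels


def update_centers(points, labels, centers):
    """Update step: each center becomes the mean of its members.
    A cluster with no members keeps its previous center."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centers.append(list(old))
        else:
            new_centers.append([sum(col) / len(members)
                                for col in zip(*members)])
    return new_centers


# Toy data (hypothetical): attribute 0 is dense for cluster 0,
# attribute 1 is dense for cluster 1, attribute 2 is noise for both.
points = [
    [1.0, 50.0, 900.0],
    [1.2, 90.0, -300.0],
    [40.0, 5.0, 700.0],
    [70.0, 5.5, -100.0],
]
centers = [[1.1, 0.0, 0.0], [0.0, 5.2, 0.0]]
relevant_dims = [[0], [1]]  # supplied by hand in this sketch

labels = assign(points, centers, relevant_dims)
centers = update_centers(points, labels, centers)
print(labels)  # [0, 0, 1, 1]
```

In this toy example a full-dimensional distance would be dominated by the noisy third attribute, while the restricted distance groups the points by the attribute in which each cluster is dense.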