Clustering And Classification Of High Speed Dimensional Data Stream In Dynamic Feature Selection

G. Senthil Velan, K. Somasundaram, V. Cyril Raj

Abstract

Change can arise at both the feature level and the concept level within a data stream. Feature-level
change may arise when new features emerge in the stream, or when a feature's value and significance
shift as the stream progresses. This kind of change has not received the same attention as change at
the concept level. In addition, many of the proposed approaches for clustering streams depend on
some type of distance or similarity metric, which is troublesome in high-dimensional data, where the
curse of dimensionality makes distance measures and any notion of “density” hard to apply. To
address these two problems, we merge them and frame the issue as a feature selection problem,
specifically a problem of dynamic feature selection. This paper proposes a new approach to the
clustering and classification of high-dimensional raw (or close to raw) data streams, which can be
implemented with a stream clustering algorithm and a k-Nearest Neighbor (kNN) classifier. The
proposed solution is based on non-standard distances, determined by hashing and compression
approaches, and on a dynamic feature mask supplied to the underlying algorithm, which together
increase clustering efficiency and decrease processing time. The proposed method is evaluated
against various existing methods, and the maximum variance method is analyzed as a means of
selecting the best features.
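
As a rough illustration of selecting features by maximum variance over a stream, the following Python sketch keeps a sliding window of recent samples and marks the k features with the highest variance in that window. It is only a sketch under stated assumptions: the abstract does not specify how the mask is computed, so the sliding window, the use of population variance, and the names VarianceFeatureMask and apply_mask are all hypothetical and are not taken from the proposed method.

```python
from collections import deque
from statistics import pvariance


class VarianceFeatureMask:
    """Hypothetical helper: select the k highest-variance features
    over a sliding window of the most recent samples."""

    def __init__(self, window_size, k):
        # Assumption: a fixed-size window stands in for "recent" stream data.
        self.window = deque(maxlen=window_size)
        self.k = k

    def update(self, sample):
        """Add one sample (a sequence of numbers) and return the new mask."""
        self.window.append(list(sample))
        return self.mask()

    def mask(self):
        """Return a list of booleans, True for the k selected features."""
        n_features = len(self.window[0])
        variances = [
            pvariance([row[j] for row in self.window])
            for j in range(n_features)
        ]
        # Rank features by variance, highest first; ties broken by index.
        ranked = sorted(range(n_features), key=lambda j: (-variances[j], j))
        selected = set(ranked[: self.k])
        return [j in selected for j in range(n_features)]


def apply_mask(sample, mask):
    """Project a sample onto the currently selected features."""
    return [x for x, keep in zip(sample, mask) if keep]


if __name__ == "__main__":
    fm = VarianceFeatureMask(window_size=3, k=1)
    for s in [[1, 0, 5], [1, 10, 5], [1, 20, 5]]:
        current = fm.update(s)
    # Only the masked features would be passed on to clustering or kNN.
    print(current, apply_mask([7, 8, 9], current))
```

Because the mask is recomputed as the window slides, a feature whose variance rises later in the stream can replace one that has become constant, which is the kind of feature-level change described above.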