Log Data Recovery using Log Clustering Approach

M. Eliazer, Subhojit Sarkar, Nisha Prasad

Abstract

Data has become so important in today's world that it is now the most attacked commodity in our society. So much data is produced at every moment that ever larger storage devices are needed to hold it. Hacking and corruption of the information stored by various systems is a growing trend affecting organizations. However many measures are taken to fully secure a system and its data, they fail at some point. An alternative is therefore to regenerate the lost data with the minimum possible deviation from the original. Log data are text files created by a computing system to record the operations it performed over a certain period. Log files contain crucial information about locations and file systems. This paper proposes a recovery algorithm for corrupted log data that does not require generating a process model. The log-repairing approach uses the k-means clustering algorithm and stores the resulting clusters on hard disk, so that they can later be used to predict corrupted log entries with the least possible deviation from the original data.

Keywords: Log data, missing log, k-means clustering, log repairing, HashMap.