Spam Identification in Social Media Using Agglomerative Clustering
Online social networking sites, such as Twitter and Facebook, let users stay in contact with other individuals, connect with one another, and form communities. Although Online Social Networks (OSNs) are used primarily for benign purposes, their open nature, large user bases, and rapid, often excessive growth in real-time messaging have made them profitable targets for cybercrime and for attacks by social bots. The rapid growth in global spam volume is expected to undermine research that relies on social media data by compromising data integrity, which motivates the detection and filtering of spam content in such data. Researchers have proposed several approaches to address these problems. Earlier systems use a hybrid approach that exploits network-based features together with metadata-, content-, and communication-based features to identify automated spammers on Twitter. Spammers are commonly planted in OSNs. Many variants of spammers, from traditional spammers to current ones, have been studied extensively, along with the legal challenges associated with handling spam, and such risks have been found to have dire implications for various Internet parties. In this paper, unlike current approaches that characterize spammers on the basis of their profiles, we detect spam for a single user who receives multiple spam messages from different websites by using hierarchical agglomerative clustering. In agglomerative clustering, a form of hierarchical clustering, each element starts in its own cluster, and the clusters are merged iteratively until all elements belong to a single cluster. The algorithm takes as input a set of elements together with the pairwise distances between them. Accuracy, precision, recall, and F1 score are calculated to evaluate the approach.
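The clustering procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the single-linkage merge rule, the toy distance matrix, the ground-truth labels, and the rule of flagging the cluster that contains a known spam message are all assumptions made for illustration, and the names `agglomerative` and `evaluate` are hypothetical.

```python
def agglomerative(dist, n_clusters=1):
    """Agglomerative clustering over a symmetric distance matrix.

    Assumes single linkage (the source does not name a linkage rule).
    Returns the final clusters and the merge history.
    """
    clusters = [[i] for i in range(len(dist))]  # each element starts in its own cluster
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, merges


def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, F1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Toy pairwise distances between five messages (made-up values).
toy_dist = [
    [0.00, 0.10, 0.20, 0.90, 0.80],
    [0.10, 0.00, 0.15, 0.85, 0.90],
    [0.20, 0.15, 0.00, 0.95, 0.90],
    [0.90, 0.85, 0.95, 0.00, 0.10],
    [0.80, 0.90, 0.90, 0.10, 0.00],
]

clusters, merges = agglomerative(toy_dist, n_clusters=2)

# Hypothetical labeling rule: flag the cluster containing message 0,
# which is assumed to be a known spam message.
spam_cluster = next(c for c in clusters if 0 in c)
y_pred = [1 if i in spam_cluster else 0 for i in range(len(toy_dist))]
y_true = [1, 1, 0, 0, 0]  # made-up ground truth for the toy example

print(clusters)
print(evaluate(y_true, y_pred))
```

Calling `agglomerative` with the default `n_clusters=1` runs the merging until every element belongs to one cluster, as the text describes; stopping earlier, as the example does with two clusters, is one simple way to obtain groups that can be labeled and scored.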