Feature Selection for Unsupervised Bot Detection

Khalil, Hunia; Khan, Muhammad U. S.; Ali, Mazhar

In recent years, botnets have become one of the biggest challenges facing online social networks. Because of the sudden increase in botnet activity, detecting bots has become crucial. Twitter is one of the most widely used platforms and a frequent target of bots. In this research, we use clustering methods for bot detection. Feature selection is a major challenge in clustering, because some features help separate the clusters while others hinder the clustering task. In this paper, we focus on selecting the features that distinguish bot users. After feature selection, we apply two clustering algorithms, DBSCAN and k-means, to detect bots. We conclude that DBSCAN performs better than k-means when using the features followers-count, friends-count, favorite-count, listed-count, retweet-count, reply-count, hashtag-count, and mentions-count, achieving 97.7% accuracy, 91% precision, 98% recall, and a 94% F-measure.
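The four reported metrics are related through the standard confusion-matrix definitions, and the F-measure is the harmonic mean of precision and recall. The sketch below is illustrative only and is not the authors' evaluation code: the function names `f_measure` and `evaluate`, the label convention (1 = bot, 0 = genuine user), and the toy labels are all assumptions made for this example. It also checks that the reported precision and recall are consistent with the reported F-measure.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(y_true, y_pred):
    """Score binary predictions, assuming 1 = bot and 0 = genuine user."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots flagged as bots
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # users flagged as bots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # users left alone

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


# Toy labels (hypothetical, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = evaluate(y_true, y_pred)

# Consistency check on the figures quoted in the abstract:
# 91% precision and 98% recall give an F-measure of about 0.94.
reported_f = f_measure(0.91, 0.98)
```

Because clustering output carries no class names, using such a function on DBSCAN or k-means results would first require mapping each cluster (or DBSCAN's noise label) to "bot" or "genuine user"; how that mapping is done is not specified here.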