DeBot: Twitter Bot Detection via Warped Correlation

Chavoshi, Nikan; Hamooni, Hossein; Mueen, Abdullah

We develop a warped correlation finder to identify correlated user accounts in social media websites such as Twitter. The key observation is that humans cannot be highly synchronous for a long duration, thus, highly synchronous user accounts are most likely bots. Existing bot detection methods are mostly supervised, which requires a large amount of labeled data to train, and do not consider cross-user features. In contrast, our bot detection system works on activity correlation without requiring labeled data. We develop a novel lag-sensitive hashing technique to cluster user accounts into correlated sets in near real-time. Our method, named DeBot, detects thousands of bots per day with a 94% precision and generates reports online everyday. In September 2016, DeBot has accumulated about 544,868 unique bots in the previous one year. We compare our detection technique with per-user techniques and with Twitter’s suspension system. We observe that some bots can avoid Twitter’s suspension mechanism and remain active for months, and, more alarmingly, we show that DeBot detects bots at a rate higher than the rate Twitter is suspending them.