In this article, we review our study of 13 493 bot-like Twitter accounts that tweeted during the UK European Union membership referendum debate and disappeared from the platform after the ballot. We discuss the methodological challenges and lessons learned from a study that emerged in a period of increasing weaponization of social media and mounting concerns about information warfare. We address the challenges and shortcomings involved in bot detection, the extent to which disinformation campaigns on social media are effective, valid metrics for user exposure, activation and engagement in the context of disinformation campaigns, unsupervised and supervised posting protocols, along with infrastructure and ethical issues associated with social sciences research based on large-scale social media data. We argue for improving researchers' access to data associated with contentious issues and suggest that social media platforms should offer public application programming interfaces to allow researchers access to content generated on their networks. We conclude with reflections on the relevance of this research agenda to public policy.

This article is part of a discussion meeting issue 'The growing ubiquity of algorithms in society: implications, impacts and innovations'.