This article explores the potential and limitations of Large Language Models (LLMs) in conducting qualitative thematic analysis (TA). Comparing GPT-4’s analysis of a YouTube dataset on Roma migrants in Sweden with a human-led analysis of the same data, we examine both the computational efficiency and the interpretative constraints of AI-driven analysis. While GPT-4 demonstrates scalability in processing large datasets, its reliance on broad, neutral classifications leads it to overlook the culturally embedded and ideologically charged dimensions of discourse. Drawing on perspectives that emphasize the situated nature of communication, we argue that human-AI synergy offers valuable methodological advances but requires critical human oversight to ensure contextual depth and analytical precision. Our study underscores the need to refine LLM performance through culturally and politically informed training and theory-driven prompting to enhance the models’ role in qualitative research.