Misinformation and manipulative online content pose significant threats to public health and decision-making. The rise of AI-generated content and platforms' reduced reliance on fact-checking programs have exacerbated these challenges. Although fully automated misinformation detection remains beyond current capabilities, researchers have developed effective methods for identifying linguistic features associated with misleading content, such as emotionally manipulative language. However, little is known about how warning labels for manipulative content influence user perceptions and behavior. In an experimental survey of 945 Americans, we investigated how manipulative content warning labels affect responses to true and false health-related social media posts. Our findings reveal differences in how users react to posts bearing different types of labels, with implications for mitigating the impact of manipulative content. We offer recommendations for designing effective labeling systems and outline future research directions to improve online content moderation strategies.
