Social Science Research Council Research AMP Just Tech
News Item

Reading Today’s Headlines Through AI: A Real-Time Audit of Six Commercial Chatbots

In a new study, scholars measured how accurately popular AI chatbots answered questions about emerging news and found substantial regional disparity, dependence on distinct information ecosystems, and acute fragility under imperfect prompts.

About 10% of Americans now turn to AI chatbots for news at least sometimes, and among news consumers under 25 worldwide, the share is closer to 15%. Yet trust is running ahead of reliability. About half of U.S. adults who get news this way reported encountering information they believed to be inaccurate, and about a third struggled to separate true claims from false. As AI quietly assumes the role that search engines once held, increasingly selective and increasingly trusted without a click-through to a source, a natural question emerges: How reliable and trustworthy are AI chatbots in answering questions about events unfolding each day?

In a new preprint study, we evaluated six commercial AI chatbots on 2,100 same-day news questions, yielding 12,600 model responses, across six regions and languages. We found that while many of the chatbots achieved over 90% accuracy on multiple-choice questions, the aggregate scores obscured three crucial patterns that bear directly on whether these systems can be trusted as news intermediaries: a regional accuracy disparity concentrated in Hindi, citation profiles shaped by retrieval-and-synthesis engineering and legal considerations, and a sharp loss of robustness whenever a question’s premise is slightly off.

The full research summary is available on the Stanford HAI website.