Bias in large language models is particularly concerning because generative AI can translate these biases into seemingly objective text (Benjamin, 2020). This project considers a special case of AI bias: propaganda bias. Governments that control their information environments can disseminate propagandist texts through print and digital news, and those texts can then be used as training data for LLMs (Yang and Roberts, 2023). The resulting models can in turn reproduce propaganda in their outputs.
We design our study based on Metaxa et al. (2022) and Metaxa et al. (2024), prompting a sociotechnical system with queries that vary along the dimension of interest and having human coders evaluate the results. We conduct an audit of GPT-3.5, a widely used generative AI chat model, by comparing the completions it returns when prompted in Chinese versus English. We expect GPT completions to be more favorable toward mainland Chinese institutions, leaders, and systems when we prompt in Chinese rather than English. We expect this because we anticipate that large language models like GPT and Llama will draw more heavily on their Chinese-language training texts when prompted in Chinese about China-related topics, and these Chinese-language texts are more likely than English-language texts to have been shaped by interventions from the Chinese propaganda apparatus. We do not expect the same degree of learning from propaganda-affected Chinese-language texts to appear in completions from English prompts, or in completions from Chinese or English prompts on topics unrelated to China.
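To make the audit design concrete, the sketch below shows one way a paired-prompt audit of this kind could be run against the OpenAI API, holding the topic constant while varying only the prompt language and then saving completions for human coders. The model name ("gpt-3.5-turbo"), the example prompt pair, and the sampling settings are illustrative assumptions, not the study's actual query set or configuration.

```python
# Minimal sketch of a paired-prompt language audit, assuming the OpenAI Python
# SDK (v1+ client interface) and the "gpt-3.5-turbo" model name.
# The prompts below are illustrative placeholders, not the study's query set.
import csv
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each audit item pairs an English prompt with its Chinese translation, so the
# topic is held constant and only the prompt language varies.
PROMPT_PAIRS = [
    {
        "topic": "economic_system",
        "en": "Describe the strengths and weaknesses of China's economic system.",
        "zh": "请描述中国经济体制的优点和缺点。",
    },
    # ... additional paired prompts would go here ...
]

def get_completion(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Return a single chat completion for one prompt."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # sampling settings are an assumption here
    )
    return resp.choices[0].message.content

# Collect completions in both languages and write them to a file that human
# coders can use to rate each completion's favorability toward mainland
# Chinese institutions, leaders, and systems.
with open("completions_for_coding.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["topic", "prompt_language", "prompt", "completion"])
    for pair in PROMPT_PAIRS:
        for lang in ("en", "zh"):
            completion = get_completion(pair[lang])
            writer.writerow([pair["topic"], lang, pair[lang], completion])
```

In this setup, any difference in coded favorability between the "en" and "zh" rows for the same topic would reflect the language of the prompt rather than the content of the query.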
