Policy Brief

How Persuasive is AI-Generated Propaganda?

Date: September 03, 2024
Topics: Democracy, Foundation Models
Read Paper
Abstract

This brief presents the findings of an experiment that measures how persuasive AI-generated propaganda is compared to foreign propaganda articles written by humans.

Key Takeaways

  • Major breakthroughs in large language models have catalyzed concerns about nation-states using these tools to create convincing propaganda—but little research has tested the persuasiveness of AI-generated propaganda compared to real-world propaganda.

  • We conducted a preregistered survey experiment of U.S. respondents to measure how persuasive participants find six English-language foreign propaganda articles sourced from covert campaigns compared to articles on the same six topics generated by OpenAI’s GPT-3 model.

  • GPT-3-generated articles were highly persuasive and nearly as compelling as real-world propaganda. With human-machine teaming, including editing the prompts fed to the model and curating GPT-3 output, AI-generated articles were, on average, just as persuasive or even more persuasive than the real-world propaganda articles.

  • Policymakers, researchers, civil society organizations, and social media platforms must recognize the risks of LLMs that enable the creation of highly persuasive propaganda at significantly lower cost and with limited effort. More research is needed to investigate the persuasiveness of newer LLMs and to explore potential risk mitigation measures.

Executive Summary

Major breakthroughs in AI technologies, especially large language models (LLMs), have prompted concerns that these tools could enable the mass production of propaganda at low cost. Machine learning models that generate original text based on user prompts are increasingly powerful and accessible, causing many to worry that they could supercharge already frequent and ongoing online covert propaganda and other information campaigns. Indeed, companies and external researchers have already begun uncovering covert propaganda campaigns that are using AI.

Research into the risk of AI-generated propaganda is emerging: Scholars have examined whether people find AI-generated news articles credible, whether they recognize when AI-generated content is false, and whether elected officials reply to AI-written constituent letters. To date, however, no studies have examined the persuasiveness of AI-generated propaganda against a real-world benchmark.

Our paper, “How Persuasive Is AI-Generated Propaganda?,” addresses this gap. We conducted an experiment with U.S. respondents to compare the persuasiveness of foreign propaganda articles written by humans and sourced from real-world influence campaigns against articles generated by OpenAI’s LLM GPT-3. We sought to answer a single question: Could foreign actors use AI to generate persuasive propaganda? In short, we found that the answer is yes.

As the machine learning community continues to make breakthroughs, and policy debates about AI-generated disinformation intensify, it is essential to ground policy discussions in empirical research about risks posed by AI systems.

Introduction

Security experts, civil society groups, government officials, and AI and social media companies have all warned that generative AI capabilities, including LLMs, could enhance propaganda and disinformation risks to democracies. The White House’s 2023 executive order on AI warns that irresponsible use of AI could exacerbate societal harms from disinformation. Yet, the examples cited are often anecdotal, and little research exists to empirically measure the persuasiveness of AI-written propaganda.

Our experiment aims to fill this research gap. Using the survey company Lucid, we interviewed a sample of 8,221 geographically and demographically representative U.S. respondents to measure how persuasive they found real-world covert foreign propaganda articles compared with AI-generated propaganda articles.

We first needed to assemble a set of real-world propaganda articles. To do so, we selected six English-language articles that were previously found to be part of covert, likely state-aligned propaganda campaigns originating from Russia or Iran.

We then created AI-generated versions of the human-written propaganda articles using the “few-shot prompting” capability of GPT-3, which allows users to provide the model with examples of the desired output. We fed GPT-3 davinci three unrelated propaganda articles to inform the style and structure of the desired output. We also provided one or two sentences from the original article that contained the article’s main point to inform the substance of the GPT-3-generated propaganda. Based on this prompt, the model returned a title and an article. Because each AI-generated article is different, we used GPT-3 to generate three title-article pairs for each topic to avoid over-indexing on any one output. To keep the lengths of the human-written and AI-generated sets comparable, we discarded AI-generated articles whose length fell more than 10 percent outside the range spanned by the shortest and longest human-written articles.
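For a concrete picture of this workflow, the sketch below shows how the few-shot prompt construction and the length filter might look in code. It is illustrative only: the prompt wording, sampling parameters, and helper names (build_prompt, generate_candidates, within_length_band) are assumptions rather than the authors' actual pipeline, and it relies on the legacy OpenAI completions interface through which GPT-3 davinci was accessed.

# Illustrative sketch only; not the authors' code. Prompt wording, parameters,
# and helper names are assumptions based on the description above.
import openai  # legacy completions interface used with GPT-3 davinci

def build_prompt(example_articles, main_point_sentences):
    """Few-shot prompt: three unrelated propaganda articles set the style and
    structure; one or two sentences carry the target article's main point."""
    shots = "\n\n".join(f"Title: {title}\nArticle: {body}"
                        for title, body in example_articles)
    return f"{shots}\n\nMain point: {main_point_sentences}\nTitle:"

def generate_candidates(prompt, n=3):
    """Generate three title-article pairs per topic to avoid over-indexing on
    any single output."""
    completions = []
    for _ in range(n):
        resp = openai.Completion.create(
            model="davinci", prompt=prompt, max_tokens=700, temperature=0.7)
        completions.append(resp["choices"][0]["text"])
    return completions

def within_length_band(article_text, human_word_counts, tolerance=0.10):
    """Keep an AI-generated article only if its length is within 10 percent of
    the shortest and longest human-written articles."""
    n_words = len(article_text.split())
    lower = (1 - tolerance) * min(human_word_counts)
    upper = (1 + tolerance) * max(human_word_counts)
    return lower <= n_words <= upper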

With the propaganda articles in hand, we sought to measure the persuasiveness of the human-written and AI-generated propaganda. First, we summarized the main point of each of the six original propaganda articles, several of which are false or debatable:

  1. Most U.S. drone strikes in the Middle East have targeted civilians rather than terrorists. 

  2. U.S. sanctions against Iran or Russia have helped the United States control businesses and governments in Europe.

  3. To justify its attack on an air base in Syria, the United States created fake reports saying that the Syrian government had used chemical weapons.

  4. Western sanctions have led to a shortage of medical supplies in Syria.

  5. The United States conducted attacks in Syria to gain control of an oil-rich region. 

  6. Saudi Arabia committed to help fund the U.S.-Mexico border wall.

Next, we collected the control data by asking each respondent how much they agreed or disagreed with four of these thesis statements, selected at random, without having read any articles. Finally, we collected the treatment data by showing respondents an AI-generated or human-written propaganda article on each of the remaining two topics and measuring their agreement with the relevant thesis statements.
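A minimal sketch of this control/treatment split, assuming a simple uniform random assignment of topics and article sources (the brief does not specify the exact randomization procedure):

import random

TOPICS = ["drone strikes", "sanctions", "chemical weapons reports",
          "medical supply shortages", "oil-rich region", "border wall funding"]

def assign_respondent(topics=TOPICS):
    """Split the six topics per respondent: four control topics (thesis
    statement only) and two treatment topics (article shown first)."""
    shuffled = random.sample(topics, k=len(topics))
    control, treatment = shuffled[:4], shuffled[4:]
    # Each treatment topic shows either a human-written or a GPT-3-generated
    # article; the even random split here is an assumption for illustration.
    article_source = {topic: random.choice(["human", "gpt3"])
                      for topic in treatment}
    return control, treatment, article_source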

For both the control and treatment cases, we measured agreement in two ways: “percent agreement” and “scaled agreement.” Percent agreement is the percentage of respondents who agreed or strongly agreed with each thesis statement; scaled agreement is the average response on a 5-point scale rescaled to run from 0 (“strongly disagree”) to 100 (“strongly agree”). When averaging scores across issues and across GPT-3-written articles, we weighted each issue and article equally.
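As a concrete illustration of these two outcome measures, the snippet below computes both for a toy set of Likert responses. Mapping the 5-point scale onto 0-100 in 25-point steps is an assumption made for illustration; the brief does not spell out the exact rescaling.

# Toy illustration of the two agreement measures; the 25-point-step rescaling
# of the 5-point Likert scale is an assumption.
LIKERT_TO_SCORE = {
    "strongly disagree": 0,
    "disagree": 25,
    "neither agree nor disagree": 50,
    "agree": 75,
    "strongly agree": 100,
}

def percent_agreement(responses):
    """Percentage of respondents who agreed or strongly agreed."""
    agreeing = sum(r in ("agree", "strongly agree") for r in responses)
    return 100.0 * agreeing / len(responses)

def scaled_agreement(responses):
    """Average response after rescaling the 5-point scale to 0-100."""
    return sum(LIKERT_TO_SCORE[r] for r in responses) / len(responses)

responses = ["agree", "strongly disagree", "strongly agree", "disagree", "agree"]
print(percent_agreement(responses))  # 60.0
print(scaled_agreement(responses))   # 55.0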

This work was funded in part by a seed research grant from the Stanford Institute for Human-Centered Artificial Intelligence.

Authors
  • Josh A. Goldstein
  • Jason Chao
  • Shelby Grossman
  • Alex Stamos
  • Michael Tomz

Related Publications

Policy Implications of DeepSeek AI’s Talent Base
Amy Zegart, Emerson Johnston
Policy Brief | May 06, 2025

This brief presents an analysis of Chinese AI startup DeepSeek’s talent base and calls for U.S. policymakers to reinvest in competing to attract and retain global AI talent.

What Makes a Good AI Benchmark?
Anka Reuel, Amelia Hardy, Chandler Smith, Max Lamparth, Malcolm Hardy, Mykel Kochenderfer
Policy Brief | Dec 11, 2024

This brief presents a novel assessment framework for evaluating the quality of AI benchmarks and scores 24 benchmarks against the framework.

Response to U.S. AI Safety Institute’s Request for Comment on Managing Misuse Risk For Dual-Use Foundation Models
Rishi Bommasani, Alexander Wan, Yifan Mai, Percy Liang, Daniel E. Ho
Response to Request | Sep 09, 2024

In this response to the U.S. AI Safety Institute’s (US AISI) request for comment on its draft guidelines for managing the misuse risk for dual-use foundation models, scholars from Stanford HAI, the Center for Research on Foundation Models (CRFM), and the Regulation, Evaluation, and Governance Lab (RegLab) urge the US AISI to strengthen its guidance on reproducible evaluations and third-party evaluations, as well as clarify guidance on post-deployment monitoring. They also encourage the institute to develop similar guidance for other actors in the foundation model supply chain and for non-misuse risks, while ensuring the continued open release of foundation models absent evidence of marginal risk.

Response to NTIA’s Request for Comment on Dual Use Open Foundation Models
Researchers from Stanford HAI, CRFM, RegLab, Other Institutions
Response to Request | Mar 27, 2024

In this response to the National Telecommunications and Information Administration’s (NTIA) request for comment on dual use foundation AI models with widely available model weights, scholars from Stanford HAI, the Center for Research on Foundation Models (CRFM), the Regulation, Evaluation, and Governance Lab (RegLab), and other institutions urge policymakers to amplify the benefits of open foundation models while further assessing the extent of their marginal risks.