How 13 Words Can Manipulate AI Research Agents

TL;DR Summary:

Poisoned Reddit Pages: A single 13-word edit on Reddit can steer deep-research AI agents to cite and recommend fake products or entities in their reports.

WARP Attack Vulnerability: The Web Agent Retrieval Poisoning attack works by altering user-generated content without needing access to the AI model, search engine, or prompts.

Existing Defenses Fail: Current text filters and report-level checks cannot reliably detect fluent, AI-written injected passages, allowing misinformation to appear alongside legitimate sources.

Can a 13-word edit manipulate what AI research agents recommend?

Cornell Tech researchers found that deep-research AI agents are vulnerable to a simple attack. A single edited comment on Reddit or a forum post can steer these AI systems to cite and recommend fake products, services, or entities in their reports.

The researchers called these altered pages "poisoned" because the added text was designed to control what the AI system cited and repeated. They named the attack WARP, which stands for Web Agent Retrieval Poisoning.

How web agent retrieval poisoning works

The attack doesn't require access to the AI model, prompts, search engine, or retrieval system. An attacker edits or adds text to a page the agent already tends to pull in. This includes Reddit threads, Wikipedia pages, or forum posts.

When the agent searches related topics later, it may retrieve that page, cite it, and repeat the attacker's message. Deep-research tools often run many related searches for one user request. The study found the same user-generated pages surfaced across related queries.

Reddit creates the biggest vulnerability to web agent retrieval poisoning

Across three AI systems tested—STORM, Co-STORM, and OmniThink—17% to 23% of retrieved URLs came from user-generated platforms. These included Reddit, YouTube, Facebook, and Wikipedia.

Reddit made up the largest share. It accounted for 54% to 71% of user-generated URLs retrieved by the three open-source systems.

The researchers didn't alter live websites. They used a simulation framework called GeoStorm to insert manipulated text into retrieved content during testing.

Short injected passages work

The researchers found the attack worked with snippets as short as about 13 words.

In one test, a 15-word sentence pushed a fake cryptocurrency called BananaCoin into a Co-STORM report as an "emerging" long-term investment option. The report cited the altered source alongside legitimate crypto sources.

When the manipulated page was retrieved, the fake entity appeared in 38% to 51% of reports across systems. Targeting multiple pages raised that range to 42% to 62%.

The attack still worked when systems retrieved full Reddit threads. When injected text was added to complete Reddit threads and made up less than 4% of the retrieved content, the fake entity still appeared in 30% to 53% of reports when the page was retrieved.

Existing defenses fail against web agent retrieval poisoning

Blocking user-generated domains stopped this attack path. But it also removed sources such as firsthand product experiences and local recommendations.

The tested text filters failed to reliably separate injected passages from normal user content. The manipulated passages were fluent because an AI model wrote them. Perplexity-based filters were more likely to flag normal user content than the injected text.

Report-level checks also missed the manipulation. Altered reports looked similar to clean reports because the agent itself folded the fake recommendation into an otherwise normal answer.

Why this matters for your brand

A small edit to a public page becomes part of a cited AI answer. The underlying source is user-generated, which makes it harder to verify. Misinformation planted on sites like Reddit or in forums moves from discussion threads to cited recommendations in AI answers that look credible to users.

Your brand appears in AI reports alongside sources you didn't choose. When poisoned pages influence what AI agents retrieve and cite, your products get mentioned in the same context as fake alternatives, misleading information, or competitors using manipulation tactics.

Monitoring citation patterns in AI research reports

Organizations and brands concerned about citation manipulation in AI reports need tools to track when and where their names appear alongside injected or misleading content.

AI Mentions tracks how AI systems cite your brand, products, or competitors across research agent outputs. This helps teams spot when unexpected sources or user-generated content influences AI recommendations.

The tool provides alerts when citation patterns shift or when new sources begin appearing in AI-generated reports about your category.

For brands, this creates an early-warning system to detect when poisoned pages may be steering AI agents to mention competitors, fake alternatives, or misinformation in the same context as legitimate products.

Research details

The paper, "Deep-Research Agents Can Be Poisoned via User-Generated Content," was written by Tingwei Zhang, Harold Triedman, and Vitaly Shmatikov of Cornell Tech. It was posted to arXiv on May 22, 2026.

The researchers tested the full attack on three open-source systems: STORM, Co-STORM, and OmniThink. They analyzed OpenAI Deep Research and Gemini Deep Research for user-generated citations but didn't run live manipulation tests because that would require publishing altered content to the open web.

The study shows that even brief edits to public pages change how AI research agents cite and recommend information. When 13 words on a Reddit thread can push a fake entity into 38% of AI reports, you need visibility into which sources influence AI citations about your brand. AI Mentions identifies when citation patterns shift and which competitor messaging AI tools have absorbed, giving you the data to respond before manipulated recommendations become widespread. You can explore how AI Mentions tracks these citation patterns and alerts you to unexpected sources appearing in AI-generated reports.

Search FSAS

How 13 Words Can Manipulate AI Research Agents

TL;DR Summary:

Can a 13-word edit manipulate what AI research agents recommend?

How web agent retrieval poisoning works

Reddit creates the biggest vulnerability to web agent retrieval poisoning

Short injected passages work

Existing defenses fail against web agent retrieval poisoning

Why this matters for your brand

Monitoring citation patterns in AI research reports

Research details

Recent Articles

Our Tools & Partners:

FREE SEO AUDIT SERVICES

EMAIL

PHONE NUMBER

ADDRESS

Follow Us:

Company

Sections

Featured

Recent Articles

Search Has Evolved. Have You?