
Which Publications Get Cited by AI Search Engines in 2026

A large-scale study of 366,000 AI search citations reveals which publications actually get cited by ChatGPT, Perplexity, and Gemini — and why your owned content rarely makes the cut.

Christian Lehman

AI search engines cite from a concentrated list of trusted publications — and the data shows exactly which ones. A study of 366,087 citations from 12 AI search models across OpenAI, Perplexity, and Google found that citations cluster heavily around a small number of outlets: the top 20 news sources account for 67.3% of all OpenAI citations, with patterns that differ sharply by provider. If your brand needs AI visibility, these are the publications that actually move the needle.

Which Publications Each AI Engine Cites Most

Citation preferences diverge significantly across providers, according to a 2025 analysis of the AI Search Arena dataset that examined real-world citations from ChatGPT, Perplexity, and Gemini (Yang et al., arXiv:2507.05301).

| Publication | Citation share |
| --- | --- |
| Reuters | 22.8% |
| AP News | 12.2% |
| Financial Times | 7.0% |
| BBC | 3.2% |
| Yahoo News | 2.7% |
| Forbes | 2.5% |

The pattern here is not arbitrary. OpenAI concentrates on primary wire services with documented editorial authority. Perplexity favors BBC, which has broad global coverage and high citation frequency. Gemini surfaces Forbes — the only traditional business publication that appears consistently across all 11 major B2B and B2C sectors studied. These aren't just popular outlets. They are outlets that AI systems have learned to treat as reliable extraction targets.

Why Concentration Is So High

Across all three providers, a handful of outlets account for the majority of citations. This concentration is structural, not coincidental.

AI search systems prefer sources with three characteristics: established domain credibility, broad topic coverage, and consistent editorial standards. Wire services like Reuters and AP dominate because they publish structured, factual content at scale across every news category. That depth of indexed content means AI engines can almost always find a relevant article from these sources on any query.

The same study found that 96.2% of OpenAI's cited sources were rated as high-quality outlets, versus 92.2% for Google and 89.7% for Perplexity. Low-credibility sources are rarely cited — not because they're actively excluded, but because high-authority sources are so abundant that low-credibility alternatives don't compete.

Why Your Own Site Almost Never Gets Cited

One of the clearest findings from recent GEO research: AI engines show an overwhelming preference for earned media over brand-owned content. A study across multiple AI search systems found that social content was "almost absent" from AI answers, and owned brand pages were systematically underweighted in favor of third-party publisher domains (arXiv:2509.08919).

Claude and ChatGPT pulled 93%+ of their citations from earned media sources across consumer electronics and automotive categories. Brand pages appeared only in the low single digits. That's not a fringe finding — it's consistent across verticals and engine families.

This is the core execution implication: building great content on your domain is necessary but insufficient for AI citation. The publications listed above serve as authority relays. Your brand gets cited because Reuters, Forbes, or BBC covered it and structured it in a way AI engines can extract. As Jaxon Parrott argued in Entrepreneur, PR now has to work for machines, not just journalists and buyers — and earning coverage in these specific publications is a large part of how that works in practice.

Getting Cited vs. Getting Absorbed

There's an important distinction operators miss: getting cited by an AI engine is not the same as having your content absorbed into the answer.

A 2025 measurement study across ChatGPT, Google AI Overview, and Perplexity (arXiv:2604.25707) found a sharp divergence between how many sources engines cite and how deeply they use them:

  • ChatGPT cites fewer sources but uses them more deeply — content that gets selected carries significantly more influence over the generated answer
  • Perplexity cites the most sources per prompt (often 10+) but absorbs them more shallowly — being cited doesn't guarantee your content shapes the answer
  • Gemini falls between the two

Getting a Forbes placement cited is step one. Getting that Forbes placement to actually influence AI answers requires structural choices in the content itself.

Pages that get deeply absorbed share a specific set of features: they are longer, more modular, semantically aligned with the query, and contain extractable evidence types — definitions, numerical facts, comparisons, and procedural steps. The study found that Q&A formatting alone does not improve absorption. Structure without evidence density doesn't move the metric.
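The evidence types above can be approximated with simple text heuristics. The sketch below is our own illustration, not the study's methodology; the regex patterns are rough assumptions about what counts as a definition, numerical fact, comparison, or procedural step.

```python
import re

# Rough heuristics (our own assumptions, not the study's rubric) for the
# evidence types the research associates with deep absorption.
PATTERNS = {
    "numerical_facts": re.compile(r"\b\d+(?:\.\d+)?%?"),
    "definitions": re.compile(r"\b(?:is defined as|refers to|means)\b", re.I),
    "comparisons": re.compile(
        r"\b(?:versus|vs\.?|compared (?:to|with)|more than|fewer than)\b", re.I
    ),
    "procedural_steps": re.compile(r"(?m)^\s*(?:\d+\.|step \d+)", re.I),
}

def evidence_density(text: str) -> dict:
    """Count extractable evidence markers, plus a markers-per-100-words rate."""
    words = max(len(text.split()), 1)
    counts = {name: len(p.findall(text)) for name, p in PATTERNS.items()}
    counts["per_100_words"] = round(100 * sum(counts.values()) / words, 2)
    return counts
```

A page with many numbers, explicit definitions, and comparative phrasing scores high on this rough metric; a purely narrative page scores near zero, which is the gap the study describes.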

What Determines Whether a Publication Gets Cited At All

The GEO-16 framework, tested across 1,702 citations from Brave, Google AI Overview, and Perplexity, identified the on-page signals most strongly associated with citation selection (arXiv:2509.10762):

Top citation predictors:

  • Metadata & Freshness — recency signals and properly structured metadata
  • Semantic HTML — correct use of heading hierarchy, structured markup
  • Structured Data — schema markup, comparison tables, definition lists

Pages with a GEO quality score ≥ 0.70 and ≥ 12 quality pillar hits achieved a 78% cross-engine citation rate. The odds ratio for being cited when hitting that threshold versus not: 4.2. That's a statistically strong relationship.
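The threshold arithmetic is straightforward: 16 binary pillar checks averaged into a 0-1 score, with the study's reported cutoffs of 0.70 and 12 hits. The sketch below is a hypothetical checklist scorer in that spirit; the full GEO-16 rubric isn't reproduced here, so every pillar name beyond the three highlighted above is a placeholder.

```python
# Hypothetical checklist scorer modeled on the GEO-16 idea: 16 binary
# "pillar hits" averaged into a 0-1 quality score. Only the first three
# pillar names come from the article; the rest are placeholders.
PILLARS = [
    "fresh_metadata", "semantic_html", "structured_data",
] + [f"pillar_{i}" for i in range(4, 17)]  # 13 placeholder pillars

def geo_score(hits: set) -> tuple:
    """Return (score, passes) using the reported thresholds:
    score >= 0.70 AND at least 12 pillar hits."""
    n = sum(1 for p in PILLARS if p in hits)
    score = n / len(PILLARS)
    return round(score, 2), (score >= 0.70 and n >= 12)
```

Note that with 16 pillars, 12 hits yields a score of 0.75, so the 12-hit floor is the binding constraint; 11 hits (0.69) fails both conditions.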

The publications that dominate AI citations — Reuters, AP, BBC, Forbes — consistently produce content that passes these signals. They've been doing it for years without explicitly optimizing for AI. The question for operators is whether your coverage in those publications is structured the same way, or whether you're getting a mention that doesn't pass the extraction threshold.

The Execution Gap

One finding that should change how teams brief PR: Perplexity visits approximately 10 relevant pages per query but cites only 3 (arXiv:2508.00838). ChatGPT shows similar selectivity. Being on the right publication is table stakes. Being the most extractable page on that publication determines whether you get cited or not.

For teams running earned media as an AI visibility strategy, this creates a clear operating standard:

  1. Target the high-concentration publications — Reuters, AP News, FT, Forbes, BBC, NYT: outlets that AI engines have already classified as reliable extraction targets
  2. Structure your quotes and brand mentions for extraction — entity name in full, category association, claim backed by data. Not "the company said" but "[Brand] provides [category], and their research shows [specific claim with number]"
  3. Push for pieces with comparison data, definitions, and numbered findings — these content types have higher absorption rates than narrative-only coverage
  4. Verify citation lift, not traffic — most AI citation value is zero-click. Measure whether your brand appears in AI answers for your target queries before and after a placement lands
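The "citation lift, not traffic" standard in step 4 reduces to a before/after mention rate across target queries. The sketch below assumes you have already captured AI answer text by some means (manual sampling, a monitoring vendor's export); nothing here calls a real API.

```python
# Citation lift: percentage-point change in the share of AI answers that
# name the brand, measured before vs. after a placement lands.
def citation_rate(answers: list, brand: str) -> float:
    """Fraction of captured AI answers that mention the brand by name."""
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if brand.lower() in a.lower())
    return hits / len(answers)

def citation_lift(before: list, after: list, brand: str) -> float:
    """Percentage-point change in brand-mention rate across AI answers."""
    return round(100 * (citation_rate(after, brand) - citation_rate(before, brand)), 1)
```

Running the same fixed query set through the same engines in both windows is what makes the comparison meaningful; changing the query set between measurements invalidates the lift number.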

For a breakdown of which publications get cited by AI engines within specific verticals — enterprise tech, fintech, healthcare, and SaaS — AuthorityTech tracks publication-level citation patterns by sector through ongoing AI visibility monitoring.

The list of publications AI engines actually trust is shorter than most PR strategies assume. Working it precisely is the difference between a media badge and source infrastructure.


Christian Lehman is Cofounder and CGO of AuthorityTech, the first Machine Relations agency. Machine Relations was coined by Jaxon Parrott in 2024.
