The Exact Content Structure Changes That Lift AI Citations 17% (Without Rewriting a Word)
New research tested content structure changes across six AI engines and found a 17.3% citation lift from formatting alone. Here's the three-level audit your team can run this week.
A new study just quantified what most GEO practitioners treat as intuition: your page's structure, independent of what it says, determines whether AI engines cite it. The research found a 17.3% citation improvement from structural changes alone across six generative engines, with an 18.5% gain in perceived content quality. The fix is formatting, not rewriting. Christian Lehman breaks down the three structural levels and what to change first.
Most teams working on AI visibility focus on what their content says. Better statistics, stronger sources, sharper arguments. That matters. But a new paper from researchers at the University of Tokyo and University of Tsukuba just isolated a variable almost nobody is optimizing: the physical structure of the page itself.
The study that separated structure from content
The paper is called "Structural Feature Engineering for Generative Engine Optimization" by Junwei Yu, Yang MuFeng, Yepeng Ding, and Hiroyuki Sato (arXiv:2603.29979, March 2026). They introduced GEO-SFE, a framework that decomposes content structure into three hierarchical levels and measures how each level independently affects AI citation behavior.
The method is what makes this useful. They kept the semantic content identical and changed only the structural features. Same words, same claims, same sources. Different formatting, different hierarchy, different chunking. Then they tested those structural variants across six generative engines and measured citation rates.
The result: a consistent 17.3% improvement in citation rates from structural changes alone, with subjective assessments showing 18.5% average enhancement in perceived content quality. The actual information on the page did not change. (Yu et al., 2026)
That finding has a specific implication for B2B marketing teams: if you've already invested in quality content but your citation rates are flat, the next highest-leverage move is probably structural, not editorial.
Three levels of structure that predict citation
GEO-SFE breaks page structure into three hierarchical levels. Christian Lehman's read: each level maps to a different team and a different sprint.
| Level | What it covers | What breaks citation | Who fixes it |
|---|---|---|---|
| Macro-structure | Document architecture: heading hierarchy, section sequence, information flow | Flat heading trees, missing H2/H3 nesting, sections in illogical order | Content strategist + CMS template owner |
| Meso-structure | Information chunking: paragraph length, list usage, table placement | Long prose blocks with no breakpoints, buried data in paragraphs, missing comparison tables | Content editor + SEO lead |
| Micro-structure | Visual emphasis: bold/italic usage, inline definitions, callout placement | Overformatted pages (everything bold), no emphasis markers on key claims, missing inline definitions | Content editor |
The paper's key methodological contribution: they developed architecture-agnostic optimization algorithms with semantic preservation constraints. The structural changes were designed to improve citation without altering meaning. This controls for the variable most GEO research conflates. When you rewrite a paragraph to add a statistic and also restructure the heading, you cannot tell which change earned the citation. GEO-SFE isolated the structure variable.
Previous GEO research, including the Princeton/Georgia Tech study (Aggarwal et al., arXiv:2311.09735), established that content enrichment strategies like adding statistics and citing authoritative sources improve AI engine visibility. GEO-SFE proves there is a second, independent lever: even without changing what the page says, how you organize it moves citation rates by 17.3%.
The compound effect matters. A page that hits both levers — strong evidence base plus clean structural formatting — stacks content quality gains on top of a 17% structural lift. Those gains compound because AI engines evaluate content and structure in separate pipeline stages. Fix one without the other and you leave citations on the table.
The macro-structure audit: heading hierarchy
This is the highest-leverage fix because it affects how AI engines parse your entire page.
The GEO-16 framework (Kumar et al., 2025) established that semantic HTML correlates at r=0.65 with citation rates across Brave, Google AIO, and Perplexity. GEO-SFE adds the mechanism: AI engines use heading structure to build an internal representation of what each section covers. When headings are flat (all H2, no H3s) or non-descriptive (marketing-speak instead of query-matching language), the engine cannot reliably extract a section-level answer.
What to check on your top 10 pages:
- Does every H2 contain a target keyword or a question a buyer would type?
- Is there at least one H3 under each H2?
- Do the headings, read in sequence, tell a coherent story without the body text?
If the answer to any of those is no, you have a macro-structure problem that no amount of better writing will fix.
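For teams auditing more than a handful of pages, the first two checks can be scripted. Below is a minimal sketch using Python's standard-library `html.parser`, assuming rendered HTML as input; the `flat_h2s` helper name is illustrative, not part of any published GEO tooling:

```python
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    """Collect H1-H4 headings in document order."""
    def __init__(self):
        super().__init__()
        self.headings = []       # list of [level, text]
        self._current = None     # heading level we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4"):
            self._current = int(tag[1])
            self.headings.append([self._current, ""])

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "h4"):
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.headings[-1][1] += data

def flat_h2s(html):
    """Return H2 headings with no H3 beneath them (a flat-tree smell)."""
    parser = HeadingAudit()
    parser.feed(html)
    flat, open_h2 = [], None
    for level, text in parser.headings:
        if level == 2:
            if open_h2 is not None:
                flat.append(open_h2)
            open_h2 = text.strip()
        elif level == 3 and open_h2 is not None:
            open_h2 = None       # this H2 has at least one H3
    if open_h2 is not None:
        flat.append(open_h2)
    return flat
```

Feeding it a page like `<h2>A</h2><h3>a1</h3><h2>B</h2><p>x</p>` returns `["B"]`: the section with no H3 nesting, which is exactly the second audit question above.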
The meso-structure audit: chunking for extraction
AI engines extract at the passage level. A 400-word paragraph with three data points, two claims, and a conclusion gives the engine no clean extraction boundary. It might still use the information, but it is less likely to cite the specific page because it cannot attribute a discrete passage.
The GEO-SFE data showed that proper chunking — shorter paragraphs with one claim per block, data in tables rather than inline, and comparison grids for multi-option analysis — produced the strongest meso-level citation improvements across all six engines tested.
Christian Lehman's practical checklist:
- No paragraph longer than 100 words when it contains a factual claim
- Any comparison of three or more items goes in a table, not a list
- Key statistics get their own sentence, not buried mid-paragraph
- Step-by-step processes use numbered lists, not prose
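The 100-word rule in the checklist is easy to automate as a first pass. A rough sketch, assuming plain-text input with blank-line paragraph breaks, and treating "contains a digit" as a crude proxy for a factual claim; both the heuristic and the `long_claim_paragraphs` name are this author's illustration, not part of the GEO-SFE framework:

```python
import re

def long_claim_paragraphs(text, max_words=100):
    """Flag paragraphs over max_words that contain a number,
    a rough proxy for a factual claim that should be chunked."""
    flagged = []
    for i, para in enumerate(re.split(r"\n\s*\n", text.strip()), start=1):
        words = para.split()
        if len(words) > max_words and re.search(r"\d", para):
            flagged.append((i, len(words)))
    return flagged
```

Each flagged `(paragraph_index, word_count)` pair is a candidate for splitting into single-claim chunks or converting to a table.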
The Moz 2026 analysis of 40,000 Google AI Mode queries found that 88% of AI Mode citations come from pages not in the organic top 10. The pages earning those citations are winning on extractability, not traditional ranking signals. Chunking is extractability.
The micro-structure audit: emphasis that guides citation
Micro-structure is the least intuitive of the three, but the GEO-SFE data was clear: strategic use of bold text, inline definitions, and callout formatting changes citation behavior. Not because the engine reads bold as "important" in a simplistic way, but because emphasis markers help the engine identify which specific claim within a section is the primary assertion.
The failure mode is overformatting. When everything on the page is bold, nothing is. The engine loses the signal.
The rule: bold exactly one sentence per H2 section, the sentence that contains the primary claim you want cited. If you cannot identify a single sentence worth bolding, the section does not have a clear enough claim to earn a citation.
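The one-bold-sentence-per-H2 rule can also be checked mechanically. A sketch over rendered HTML, again with the stdlib `html.parser`; the `emphasis_report` helper and its ok/over/missing labels are illustrative conventions, not a standard:

```python
from html.parser import HTMLParser

class EmphasisAudit(HTMLParser):
    """Count <strong>/<b> elements under each H2 section."""
    def __init__(self):
        super().__init__()
        self.sections = []       # list of [h2_text, bold_count]
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.sections.append(["", 0])
        elif tag in ("strong", "b") and self.sections:
            self.sections[-1][1] += 1

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2 and self.sections:
            self.sections[-1][0] += data

def emphasis_report(html):
    """Label each H2 section: exactly one bold claim is 'ok'."""
    parser = EmphasisAudit()
    parser.feed(html)
    return {head.strip(): ("ok" if n == 1 else "over" if n > 1 else "missing")
            for head, n in parser.sections}
```

A section labeled "over" is the overformatting failure mode described above; "missing" means no primary claim is marked at all.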
This pattern aligns with what Ahrefs found in their ChatGPT citation analysis: 67% of ChatGPT's top citations go to original research and first-hand data. When the original data point is structurally emphasized and isolated in its own chunk, it is more citable. When it is embedded in a long paragraph with no formatting differentiation, it competes for attention with every other sentence around it.
The sequencing that gets this done in two weeks
Most teams treat structural optimization as a vague backlog item. GEO-SFE gives you a concrete priority order.
Week 1: Macro + meso on your top 5 pages
Pull your top five pages by search impressions. For each:
- Restructure heading hierarchy to match buyer queries
- Break long paragraphs into single-claim chunks
- Convert inline comparisons to tables
- Add an FAQ section if missing (2-3 questions, phrased as search queries)
Week 2: Micro-structure pass + validation
- Bold the primary claim in each H2 section
- Add inline definitions for technical terms on first use
- Strip overformatting — remove bold from any sentence that is not a primary claim
- Run the pages through your AI citation architecture monitoring to measure baseline
This is a content ops sprint, not a content creation sprint. You are reformatting existing assets, not producing new ones. The GEO-SFE result — 17.3% improvement — came without changing a single word of content.
Why this compounds through earned media
Structural optimization matters more when your content lives on third-party domains. The Muck Rack Generative Pulse study found that earned media accounts for the vast majority of AI citations, with 95% coming from non-paid sources. When an earned placement in a trusted publication is structurally clean — proper headings, chunked claims, one bold key finding — the AI engine can extract and cite it more reliably than a placement with the same information buried in prose.
This is the infrastructure argument behind Machine Relations, which Jaxon Parrott has written about extensively. Earned media creates the domain trust AI engines require. Structural optimization creates the extractability those engines need to actually cite the placement. The two work as a system. Placement without structure means the engine trusts the domain but cannot extract a clean citation. Structure without placement means the engine can extract perfectly from a domain it does not trust enough to cite. Running both tracks is what the GEO-SFE data, combined with Muck Rack's earned media citation research, makes non-optional.
FAQ
Does content structure affect AI citations independently of content quality? Yes. The GEO-SFE study held semantic content constant and changed only structural features, measuring a 17.3% citation improvement across six generative engines. Structure and content are independent citation levers. (Yu et al., 2026)
Which structural level has the highest impact on AI citations? Macro-structure (heading hierarchy and document architecture) had the broadest cross-engine effect because it determines how AI engines parse and represent the entire page. Meso-structure (chunking) had the strongest effect on passage-level extraction specifically.
How long does a structural content audit take? For a team of two (content editor + dev), a structural audit and fix of 5-10 priority pages takes roughly two weeks. No new content creation required. The changes are formatting, not editorial.
If you want to know whether your existing content has structural gaps suppressing AI citations, AuthorityTech's visibility audit checks both on-page structure and earned media presence across the engines where your buyers research.