How to monitor what AI says about your brand
Most companies discover AI visibility gaps only after losing pipeline to competitors who appear in ChatGPT and Perplexity answers. Here is the systematic monitoring framework that tells you where you stand before it costs you deals.
Most founders find out their company is invisible to AI engines the same way: a prospect tells them. Sometimes it is direct. "ChatGPT didn't mention you when I asked about options in your category." Sometimes it is indirect. A deal closes slower than expected, or doesn't close at all, because the buyer did their AI-powered research before the first call and your name never came up.
This is the core problem with how most companies think about AI brand visibility: they treat it as a problem to solve after they notice it, not a condition to monitor continuously. The gap between "AI engines started representing your brand differently" and "you found out" is where pipeline bleeds.
Only 16 percent of brands today systematically track AI search performance, according to a McKinsey survey of 1,927 consumers and brand decision-makers. A separate analysis by Search Engine Land found that only 22 percent of marketers have set up any form of LLM brand visibility or traffic monitoring. The large majority of companies are flying without instrumentation in the one channel that is reshaping how B2B buyers form their shortlists.
This post covers what AI brand monitoring actually involves, what you should be measuring, how to run a baseline protocol without any paid tools, and what the data should tell you. The goal is not to add another dashboard to your stack. The goal is to tell you where your brand stands in AI-generated answers before you find out from a lost deal.
Key takeaways
- AI brand monitoring is not traditional brand monitoring. You are tracking whether AI engines cite you, recommend you, and represent you accurately, not just whether your name appears online.
- The four core metrics are citation frequency, brand representation accuracy, AI share of voice versus competitors, and the types of queries that surface your brand.
- A manual monitoring protocol running monthly takes under two hours and requires no paid tools. It tells you your baseline position and where gaps exist.
- The platforms to monitor are not identical. ChatGPT drives 87.4 percent of AI referral traffic according to Conductor's AI search benchmark, but Perplexity and Gemini behave differently and should be audited separately.
- AI visibility is volatile. Research tracking 2,500 prompts across multiple verticals found that even top brands see month-to-month fluctuation in AI mentions. Only the most stable brands held below 20 percent monthly volatility in AI share of voice, according to the Semrush AI Visibility Index.
- Monitoring without a fix plan is useful only for diagnosis. The mechanism that actually changes AI representation is earned media in publications AI engines trust.
Why AI brand monitoring is different from what you've been doing
Traditional brand monitoring tracks mentions in press, social media, review platforms, and forums. The logic is that more mentions mean more reach. Sentiment analysis tells you whether the mentions are positive or negative. Volume tells you whether coverage is growing.
AI brand monitoring tracks something structurally different. The question is not how often your brand is mentioned across the web. The question is whether, when a human asks ChatGPT or Perplexity or Gemini a question your brand should be in the answer to, your brand actually appears and how it is represented when it does.
This distinction matters because AI systems are not search engines in the conventional sense. They do not return a ranked list of web pages. They synthesize an answer from sources they consider authoritative, then either cite those sources or fold them into the response without attribution. A 2025 analysis published on arXiv found that LLMs systematically consume relevant web content without providing adequate citation, and major LLM vendors reveal little about how their retrieval-augmented generation pipelines choose which content to ingest and which to cite. The practical implication: your brand can be shaping an AI answer without your URL appearing, or your brand can be completely absent from answers where it should be central.
Both conditions are invisible to traditional monitoring. Both conditions have direct consequences for whether buyers shortlist you.
What to measure: the four metrics
Citation frequency
How often does your brand appear when AI engines answer queries relevant to your category? This is the most basic metric, and it varies more than most companies expect. You might appear frequently in ChatGPT answers but rarely in Perplexity, or vice versa, because the platforms have different source preferences and different update cycles for their underlying knowledge.
Measure this by running a fixed set of queries monthly across each platform and recording whether your brand appears, where in the response it appears (first mention versus later mention versus citation link only), and whether the mention is as a primary recommendation or a comparison to something else.
Brand representation accuracy
When AI engines do mention your brand, what do they say? This is more fragile than citation frequency because training data is static until updated, and the web content that AI engines indexed about your company may be outdated, incomplete, or inaccurate. AI systems can describe your pricing model incorrectly, misattribute founding dates, describe services you discontinued, or conflate you with a different company in your category.
Inaccurate AI representation is not a niche problem. A peer-reviewed study published on arXiv analyzing GPT-4's citation and reference behavior found that LLMs internalize citation networks from training data in ways that reflect the biases and recency limitations of that data. What this means for brand representation: if the web content about your company that AI engines indexed is six months old and your positioning has shifted since then, the AI may be representing a version of your company that no longer exists.
AI share of voice
Of the times AI engines answer a question about your category, how often are you recommended versus your competitors? Share of voice in AI answers is not proportional to share of voice in traditional search rankings. Only 7.2 percent of domains appear in both Google AI Overviews and LLM results simultaneously, according to a joint study by Search Engine Land and Fractl analyzing 22,410 unique domains. The set of companies that AI engines recommend is largely distinct from the set that ranks in traditional search. Your SEO performance is not a proxy for your AI share of voice.
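To make the metric concrete, here is a minimal sketch of one way to compute AI share of voice from a month of recorded audit responses. The brand names and query labels are placeholders, and the definition used (your mentions divided by all tracked-brand mentions) is an assumption for the sketch rather than a standard any platform publishes.

```python
from collections import Counter

# Each entry records which tracked brands one AI response recommended.
# Illustrative data only; in practice this comes from your monthly audit log.
responses = [
    {"query": "best options for [category]", "platform": "ChatGPT",
     "brands_mentioned": ["CompetitorA", "YourBrand"]},
    {"query": "best options for [category]", "platform": "Perplexity",
     "brands_mentioned": ["CompetitorA", "CompetitorB"]},
    {"query": "[YourBrand] vs [CompetitorA]", "platform": "Gemini",
     "brands_mentioned": ["YourBrand", "CompetitorA"]},
]

def ai_share_of_voice(responses, brand):
    """Share of voice = mentions of `brand` / mentions of all tracked brands."""
    counts = Counter()
    for r in responses:
        counts.update(r["brands_mentioned"])
    total = sum(counts.values())
    return counts[brand] / total if total else 0.0

print(f"AI share of voice: {ai_share_of_voice(responses, 'YourBrand'):.0%}")
```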
Query type coverage
AI search behavior differs from keyword search behavior. Buyers use AI engines for conversational, multi-part queries: "What's the best way to handle PR for a SaaS company that doesn't want to use a traditional agency?" rather than the traditional short-form search "PR agency SaaS." Your brand may appear in responses to some query types but not others. The query types that matter most are the ones that correspond to high-intent buyer moments, not generic category awareness questions.
Tracking which query types surface your brand tells you whether you are present at the part of the buyer journey where decisions are actually forming, or only at the awareness stage.
The manual monitoring protocol
Running a manual AI brand audit takes roughly 90 minutes per month. No paid tools required. The protocol below gives you a reliable baseline and surfaces the gaps that matter.
Step 1: Build your query set
You need 15 to 20 prompts organized into three categories. Run these queries across ChatGPT, Perplexity, and Gemini each time.
Category queries test whether you appear when buyers research your space: "What are the best options for [your category]?" or "Who are the leading companies in [your category] for [company type]?" Run four to six category queries that reflect how your buyers actually describe the problem, not how you describe your solution.
Comparison queries test whether you appear when buyers are shortlisting: "How does [your company] compare to [competitor A] and [competitor B]?" or "What's the difference between [your company] and [alternative approach]?" These queries are high-intent. Appearing here versus not appearing here is often the difference between making the shortlist and not.
Brand-specific queries test whether AI represents your brand accurately when directly asked: "What does [your company] do?" "Who founded [your company]?" "Is [your company] right for [specific use case]?" These surface representation accuracy problems.
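A minimal sketch of assembling the query set from templates follows, assuming placeholder values for the category, competitors, and use case. The prompt wording should come from how your buyers describe the problem, not from these illustrative templates.

```python
# Fill-in values for the templates below. Illustrative assumptions only.
context = {
    "category": "B2B PR software",
    "company": "YourBrand",
    "competitor_a": "CompetitorA",
    "competitor_b": "CompetitorB",
    "use_case": "a seed-stage SaaS company",
}

templates = {
    "category": [
        "What are the best options for {category}?",
        "Who are the leading companies in {category} for {use_case}?",
    ],
    "comparison": [
        "How does {company} compare to {competitor_a} and {competitor_b}?",
        "What's the difference between {company} and {competitor_a}?",
    ],
    "brand": [
        "What does {company} do?",
        "Who founded {company}?",
        "Is {company} right for {use_case}?",
    ],
}

# Expand every template into a (query_type, prompt) pair to run on each platform.
query_set = [
    (query_type, template.format(**context))
    for query_type, prompts in templates.items()
    for template in prompts
]

for query_type, prompt in query_set:
    print(f"[{query_type}] {prompt}")
```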
Step 2: Run the queries and record outputs
For each query and each platform, record: whether your brand appears (yes or no), where in the response (first mention, secondary mention, citation link only, not present), the verbatim description used, and which competitors appear in the same response.
Use a simple spreadsheet with columns for query, platform, your brand presence, competitor presence, and notes on any inaccuracies. Do not summarize or paraphrase. Copy the relevant text verbatim. The exact language AI engines use about your brand matters, and paraphrasing loses the detail you need to diagnose representation problems.
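A sketch of that spreadsheet as a CSV log. The column names, the allowed values for mention position, and the example row are assumptions made for illustration; adjust them to your own recording conventions.

```python
import csv

FIELDS = ["date", "query_type", "query", "platform", "brand_present",
          "mention_position", "verbatim_description", "competitors_present",
          "inaccuracy_notes"]

rows = [
    {
        "date": "2026-03-01",
        "query_type": "comparison",
        "query": "How does YourBrand compare to CompetitorA?",
        "platform": "Perplexity",
        "brand_present": "yes",
        "mention_position": "secondary",  # one of: first, secondary, citation_only, absent
        "verbatim_description": "YourBrand is a smaller alternative focused on ...",
        "competitors_present": "CompetitorA; CompetitorB",
        "inaccuracy_notes": "Describes a discontinued pricing tier",
    },
]

# Append a row per query per platform each month; keep the verbatim text intact.
with open("ai_brand_audit.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```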
Step 3: Score your position
After running all queries, calculate three scores. Citation rate: the percentage of queries where your brand appeared at all. First-mention rate: the percentage of queries where your brand was the first company recommended. Representation accuracy: the percentage of brand-specific queries where the AI's description was accurate and current.
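A minimal sketch of computing the three scores from the audit log above, assuming the same column names as the CSV sketch. Treating an empty inaccuracy-notes field as an accurate representation is an assumption about how you record accuracy.

```python
import csv

def score_audit(path="ai_brand_audit.csv"):
    """Citation rate, first-mention rate, and representation accuracy for one month."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))

    total = len(rows)
    appeared = [r for r in rows if r["brand_present"] == "yes"]
    first = [r for r in appeared if r["mention_position"] == "first"]
    brand_rows = [r for r in rows if r["query_type"] == "brand"]
    accurate = [r for r in brand_rows
                if r["brand_present"] == "yes" and not r["inaccuracy_notes"].strip()]

    return {
        "citation_rate": len(appeared) / total if total else 0.0,
        "first_mention_rate": len(first) / total if total else 0.0,
        "representation_accuracy": len(accurate) / len(brand_rows) if brand_rows else 0.0,
    }

for metric, value in score_audit().items():
    print(f"{metric}: {value:.0%}")
```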
Run the same protocol monthly. The trend matters more than any single month's results because AI visibility is volatile. Research tracking brands across AI platforms over three months found meaningful fluctuation even for well-established companies. The Semrush AI Visibility Index, which monitored 2,500 real-world prompts across five verticals, found Google AI Mode brand mentions dropped four percent month-over-month in certain periods, and new brands entered the top 100 at a rate of 25 per quarter.
Platform-specific behavior you need to account for
ChatGPT
ChatGPT is where most AI-referred traffic originates. According to Conductor's analysis of 10 major industries, ChatGPT drove 87.4 percent of all AI referral sessions tracked. Similarweb data analyzed by Search Engine Land shows ChatGPT's market share of generative AI traffic reached 87.5 percent before gradually declining as Google AI Mode and other platforms grew, though it remains dominant by a large margin. Its brand recommendations lean heavily on sources indexed during training, up to its knowledge cutoff, plus real-time web search results when browsing is enabled. The implication: your presence in high-authority publications that predate ChatGPT's training cutoff gives you a baseline citation floor, while coverage published after the cutoff surfaces only through real-time browsing.
Practically, this means your ChatGPT visibility is a function of your historical editorial footprint plus your recent publication record. Both matter, and both can be measured by looking at which placements ChatGPT cites when it does mention your brand.
Perplexity
Perplexity is a real-time retrieval system that searches the web for each query and synthesizes from current sources. Its citation behavior is more transparent than ChatGPT's: it shows the sources it pulled from. This makes it the most diagnostic platform for understanding which publications AI engines are treating as authoritative for your category.
When Perplexity recommends a competitor over you, the citations it shows reveal exactly which publications it drew from. That list tells you which publications you need coverage in. This is the most actionable output of a Perplexity audit.
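A sketch of turning those recorded citations into a publication gap list: tally the domains Perplexity cited in responses where competitors appeared and you did not. The semicolon-separated citations column is an assumed addition to the audit log from the earlier sketch.

```python
import csv
from collections import Counter
from urllib.parse import urlparse

def publication_gap_list(path="ai_brand_audit.csv"):
    """Domains Perplexity cited in answers that surfaced competitors but not you."""
    gaps = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["platform"] != "Perplexity":
                continue
            # Only responses where you were absent and a competitor was present.
            if row["brand_present"] == "yes" or not row.get("competitors_present", "").strip():
                continue
            # Assumed column: semicolon-separated citation URLs copied from Perplexity.
            for url in row.get("citations", "").split(";"):
                url = url.strip()
                if url:
                    gaps[urlparse(url).netloc] += 1
    return gaps.most_common()

for domain, count in publication_gap_list():
    print(f"{domain}: cited {count} times in answers you were absent from")
```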
Gemini
Gemini draws on Google's Knowledge Graph and structured data sources, including Google Business Profiles and Wikipedia. It is more sensitive to entity recognition than the other platforms: whether your brand exists as a clearly defined entity in Google's knowledge systems significantly affects how Gemini represents you. Representation accuracy problems that appear in Gemini audits often trace to Knowledge Graph gaps rather than editorial footprint gaps.
When to escalate to tracking tools
The manual protocol described above gives you a reliable monthly snapshot. It does not give you daily data, competitive benchmarking across a large query set, or automated alerts when AI representation changes materially.
If any of the following are true, you need automated tracking rather than manual audits alone: you are in a category with more than three active competitors being recommended by AI engines; your sales cycle is longer than 60 days (meaning the AI visibility gap during a deal has more time to compound); or you are running active editorial or content campaigns and need to see whether they are affecting AI representation within weeks rather than months.
The core metrics for automated tracking, as identified by Search Engine Land's analysis of available tools, are citation frequency per platform, brand visibility score, AI share of voice versus named competitors, and geographic performance variance. Geographic variance matters if your market is regional or if AI engines represent your brand differently across geographies.
When evaluating tools, check whether they cover all three major platforms (ChatGPT, Perplexity, Gemini), whether they allow you to define your own query set (generic category queries miss the specific buyer language in your market), and whether they surface which sources the AI is drawing from rather than just whether you appear. Knowing that Perplexity cited a competitor in an industry category query tells you there is a gap. Knowing which publications it pulled from to form that answer tells you exactly what to do about it.
What the data actually tells you
The output of consistent AI brand monitoring converges on one of four diagnostic patterns.
The first pattern is low citation frequency with accurate representation when you do appear. This means AI engines have a positive signal about your brand but it is weak. There is not enough coverage in the publications they treat as authoritative to surface you reliably. The fix is volume: more placements in more high-authority publications.
The second pattern is moderate citation frequency with representation inaccuracies. This means you have some editorial footprint but it contains outdated or incorrect information about your brand. The fix is recency: fresh placements in authoritative sources that accurately describe your current positioning, replacing the older content that AI engines have indexed.
The third pattern is high citation frequency in category queries but low citation frequency in comparison queries. This means AI engines know you exist in the category but are not including you when buyers are shortlisting. This is often a depth problem. You have awareness-level coverage but lack the in-depth coverage (case studies, founder interviews, data-backed claims) that AI engines use when forming purchase recommendations rather than category introductions.
The fourth pattern is high citation frequency but low AI share of voice. You appear, but you appear as an also-ran after competitors who are recommended first or more frequently. This is a positioning problem in your editorial footprint. The publications that AI engines are citing when they answer high-intent queries are writing about your competitors as the primary subject and mentioning you comparatively.
Each of these patterns points to a different kind of editorial intervention. None of them can be diagnosed without consistent monitoring data. And none of them can be fixed without understanding what publications AI engines are pulling from when they answer the queries that matter to your buyers.
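As a rough illustration of how the four patterns translate into rules, the sketch below classifies a month of audit scores. Every threshold in it is an illustrative assumption, not a benchmark from the research cited above.

```python
def diagnose(citation_rate, accuracy, category_rate, comparison_rate, share_of_voice):
    """Map monthly audit scores to one of the four diagnostic patterns.
    All thresholds below are illustrative assumptions."""
    if citation_rate < 0.3 and accuracy >= 0.8:
        return "Pattern 1: weak but accurate signal - add volume in authoritative publications"
    if citation_rate >= 0.3 and accuracy < 0.8:
        return "Pattern 2: stale or wrong footprint - add recent, accurate coverage"
    if category_rate >= 0.5 and comparison_rate < 0.3:
        return "Pattern 3: awareness without depth - add case studies and data-backed coverage"
    if citation_rate >= 0.5 and share_of_voice < 0.25:
        return "Pattern 4: present but positioned behind competitors - change who the coverage is about"
    return "No single dominant pattern - review the raw audit log"

print(diagnose(citation_rate=0.6, accuracy=0.9, category_rate=0.7,
               comparison_rate=0.2, share_of_voice=0.4))
```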
The traffic reality check
One objection to investing in AI brand monitoring is that AI referral traffic is still a small share of total traffic. This is currently true. An analysis of 13 months of LLM referral traffic published by Search Engine Land in February 2026 found that LLM referral traffic accounts for less than two percent of total referral traffic on average, with a range of 0.15 percent to 1.5 percent across platforms including ChatGPT, Perplexity, Gemini, and others.
The reason this understates the importance of AI monitoring is conversion rate and buyer intent. The same 13-month dataset found that LLM-referred sessions converted to leads or sales at roughly 18 percent. That is not a traffic play. That is a high-intent buyer behavior pattern. The visitors who click through from an AI answer have already been pre-qualified by the AI's recommendation. They arrive with a reason to believe.
The McKinsey report on AI search behavior found that 44 percent of AI-powered search users now describe it as their primary and preferred search method. The share of Google searches with AI summaries, already at roughly 50 percent, is projected to exceed 75 percent by 2028 according to McKinsey's trend analysis. AI referral traffic is small today because many buyers are still forming the habit. Data published by Similarweb in January 2026 shows organic search traffic declining 2.5 percent year-over-year, with AI Overviews appearing in roughly 30 percent of SERPs and cutting click-through rates by 35 percent when they appear. The monitoring infrastructure you build now will tell you whether you are in a position to capture that shift as it accelerates, or whether you will be watching competitors take it.
Why monitoring leads back to earned media
Running AI brand audits consistently produces a specific realization: the companies that appear most reliably, most accurately, and most favorably in AI-generated answers are the companies with the strongest editorial footprints in publications that AI engines treat as authoritative. This is not a correlation that requires AI-specific explanation. It is the same mechanism that drove PR value for decades, now applied to a different reader.
AI engines learn what is credible the same way human buyers do: from the sources they trust. The publications that shaped opinion in your industry for years are the same publications that AI engines index as authoritative. When a buyer asks ChatGPT or Perplexity who leads your category, the answer is downstream of which company has the strongest editorial presence in the sources the AI treats as ground truth. That is why monitoring without a publication strategy only tells you where you stand. It does not improve the position.
The Semrush AI Visibility Index, which analyzed 2,500 real-world prompts across five major verticals, illustrates this directly: brands with the most stable AI share of voice over its three-month tracking period were not the brands that had optimized their website for GEO or added structured data. They were the brands that had accumulated editorial coverage in publications the AI systems indexed as authoritative before the optimization conversation started. Coverage compounds. A placement in Forbes or TechCrunch or Harvard Business Review from two years ago is still being indexed, still being cited, still being used as a source when AI engines form their answers today.
This is what Machine Relations defines as the new operating layer of brand authority in the AI era: earned media in trusted publications drives AI citation, which drives buyer awareness, which drives pipeline. The mechanism is not new. PR's original value proposition was exactly this. Third-party credibility from respected publications shapes how buyers think about a brand. What changed is that the audience is no longer only human. AI systems are doing the first cut of research before human buyers even engage. Machine Relations is the discipline of ensuring your editorial footprint speaks to that machine reader, not just the human one.
Your monitoring data tells you where the gaps are. Closing them requires coverage in the right publications. Not more content on your own website, not better SEO metadata, not technical GEO optimizations. Those things matter at the margin. The primary signal AI engines use to determine credibility is the same signal they have always used: whether respected, third-party sources have covered your brand as worth covering.
For companies that have not yet built systematic AI monitoring, the gap between their current editorial footprint and where they need to be is visible. For companies monitoring consistently, the gap is quantified and prioritizable. The difference between those two positions is the difference between reacting to lost deals and anticipating them before they happen.
If you want to understand where your brand currently stands in AI-generated answers across ChatGPT, Perplexity, and Gemini before running the manual protocol yourself, the visibility audit runs that diagnostic automatically and tells you which query types are returning your competitors instead of you.
Frequently asked questions
How often should we run AI brand audits?
Monthly is the minimum for most companies. If you are in an active editorial campaign, running multiple placements per month, audit every two to three weeks to see whether fresh coverage is affecting AI representation within a reasonable lag. The typical lag between a new placement going live and that placement influencing AI outputs is two to six weeks, depending on the platform and how frequently it updates its source index.
What queries should we prioritize if we can only monitor a small set?
Prioritize comparison queries over category queries. The question "How does [your company] compare to [competitor]?" is closer to buyer intent than "Who are the best companies in [your category]?" Buyers who ask comparison questions are further down the evaluation funnel. If you are absent from comparison query answers, you are missing buyers who are already considering purchasing. That is the gap with the most direct pipeline consequence.
Our brand appears in AI answers but the description is wrong. What fixes that?
AI representation accuracy is a content problem, not a technical settings problem. The fastest path to correcting inaccurate AI representation is new, accurate coverage in high-authority publications that AI engines index and trust. A press release does not fix this. A blog post on your own website does not fix this. A bylined article or a feature in a publication the AI engine considers authoritative, describing your company accurately and with current information, is what shifts the representation over time. If the inaccurate description is appearing consistently across multiple AI platforms, that tells you the inaccurate source is widely indexed. The fix requires a more authoritative source that gives AI engines accurate information to pull from instead. The companion guide on fixing brand sentiment in AI search covers this in full detail.
Do we need to monitor every AI platform?
At minimum, audit ChatGPT, Perplexity, and Gemini. These three platforms have different source preferences, different update cycles, and different citation behavior. Your performance on one does not predict your performance on the others. The Search Engine Land and Fractl study found that only 7.2 percent of domains appearing in LLM results also appeared in Google AI Overviews, confirming that these are distinct systems with distinct source preferences. Monitoring one tells you about one. If you are resource-constrained, start with Perplexity because it shows its citations. That output data tells you the most about where editorial gaps exist.
How do we measure ROI from AI brand monitoring?
The most direct measurement is tracking the conversion rate of LLM-referred sessions in your analytics. Set up source tagging that distinguishes ChatGPT referrals, Perplexity referrals, and Gemini referrals from each other and from other organic sources. The 13-month dataset referenced earlier found an 18 percent conversion rate for LLM-referred sessions on average. If your LLM-referred conversion rate is significantly below that, you have a representation or positioning problem in AI answers, not just a visibility gap. You are appearing, but appearing in a context that does not convert. That diagnostic requires looking at what AI engines are actually saying when they mention you, not just whether they mention you.
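A sketch of that source tagging, assuming a simple export of sessions with a referrer URL and a conversion flag. The referrer hostnames listed are the ones commonly associated with each assistant, but verify them against what actually appears in your analytics.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Referrer hostnames commonly associated with each assistant (verify in your analytics).
LLM_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}

# Illustrative session records; in practice these come from your analytics export.
sessions = [
    {"referrer": "https://chatgpt.com/", "converted": True},
    {"referrer": "https://www.perplexity.ai/search/abc", "converted": False},
    {"referrer": "https://www.google.com/", "converted": False},
]

def llm_conversion_rates(sessions):
    """Per-platform conversion rate for sessions referred by an LLM assistant."""
    stats = defaultdict(lambda: {"sessions": 0, "conversions": 0})
    for s in sessions:
        platform = LLM_REFERRERS.get(urlparse(s["referrer"]).netloc)
        if platform:
            stats[platform]["sessions"] += 1
            stats[platform]["conversions"] += s["converted"]
    return {p: v["conversions"] / v["sessions"] for p, v in stats.items()}

print(llm_conversion_rates(sessions))
```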
Separately, track competitor AI share of voice alongside your own. If a competitor's AI visibility increases in the same period that your deal velocity slows in a particular segment, that correlation is worth investigating. It will not always be causal, but the monitoring data gives you the means to ask the question rather than guessing. For a full breakdown of how AI share of voice is measured and what drives movement in it, see the guide on AI share of voice.