What is AEO and why does it matter now?

Answer Engine Optimization is the practice of structuring your brand so AI models recommend you when buyers ask for solutions in your category. Traditional search puts you on a results page, while AI search gives the buyer a direct answer where your brand is either named or it is not.

How is Windgrove different from an SEO agency?

SEO gets you ranked on Google, while AEO gets you recommended by AI. Windgrove targets the three inputs that determine whether an AI recommends you: structured content, entity consistency, and citation authority.

How long does it take to see results?

First AI citations typically appear within 45 to 60 days, sometimes earlier. Broader category terms take 1 to 3 months as authority compounds with ongoing content, fresh citations, and stronger entity signals.

What AI platforms does Windgrove optimize for?

Windgrove focuses on improving visibility in major answer engines and AI assistants, including ChatGPT, Perplexity, and Gemini, by improving the structured content and citation signals those systems rely on.

How to Do Competitor Benchmarking for AEO (With Concrete Examples)

Spencer Duke | May 15, 2026 | 10 min read

AEO competitor benchmarking is the process of systematically measuring how often your competitors appear in AI-generated answers, which queries they own, and what signals are driving their citations, so you can close the gap or pull ahead.

Executive Summary

AI search engines like ChatGPT, Perplexity, and Google AI Overviews process an estimated 2.5 billion AI-assisted queries daily, and in most categories, just two or three brands dominate the citations. Benchmarking tells you who those brands are and why.
The core benchmarking process involves four steps: building a query set, running systematic prompt tests across multiple LLMs, scoring share of voice, and auditing the content signals that explain your competitors' visibility.
Standard SEO tools miss roughly 37% of product discovery queries happening in ChatGPT and Perplexity, which means competitor benchmarking for AEO requires a separate, dedicated process.
The output of a benchmarking exercise is not a report. It is an action list: specific content gaps to fill, entity signals to strengthen, and third-party sources to earn mentions on.

Most marketing teams first encounter AEO competitor benchmarking the same way: someone runs a buyer query in ChatGPT, sees a competitor recommended by name, and asks "why them and not us?" That moment of recognition is useful, but it is not a strategy. A single prompt test tells you almost nothing about the competitive landscape. A systematic benchmarking process shows you who is cited, for which queries, on which platforms, and why.

The stakes are real. According to Conductor's 2026 AEO/GEO Benchmarks Report, 97% of CMOs reported positive business impacts from AEO in 2025, and 94% plan to increase investment in it. AI visibility is becoming its own performance channel, one that determines which brands are trusted enough to enter the answer before anyone clicks. Brands that establish citation authority early accumulate compounding advantages: AI systems cite frequently cited brands more often, creating a flywheel that late movers will struggle to break into.

This guide walks through the exact process for benchmarking your competitors' AEO performance, with concrete examples at every step.

Step 1: Build Your Benchmark Query Set

Before you can measure anything, you need a defined set of queries. The queries you choose determine what competitive landscape you are actually measuring. Use the wrong queries and your benchmark reflects a market that does not exist.

Your query set should represent how real buyers research your category, not how your marketing team talks about it.

A useful benchmark query set has three layers:

Category queries: Broad questions about the problem space. Example: "What is the best project management software for agencies?" or "How do I choose a B2B data provider?"
Comparison queries: Queries that pit solutions against each other. Example: "What is the difference between Tool A and Tool B?" These are high-value because AI engines frequently synthesize direct comparisons.
Use-case queries: Specific scenarios your buyers face. Example: "What tool should I use to track AI citations for my brand?" or "Best CRM for a 50-person SaaS company."
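
The three layers can be sketched as a small tagged data structure. This is a minimal illustration, not a prescribed format: the `BenchmarkQuery` class and `layer_coverage` helper are hypothetical names, and the queries are the examples from the list above.

```python
from collections import Counter
from dataclasses import dataclass

LAYERS = ("category", "comparison", "use_case")


@dataclass(frozen=True)
class BenchmarkQuery:
    text: str
    layer: str  # one of LAYERS


# Example queries taken from the list above; a real set would hold 30 to 50.
QUERY_SET = [
    BenchmarkQuery("What is the best project management software for agencies?", "category"),
    BenchmarkQuery("How do I choose a B2B data provider?", "category"),
    BenchmarkQuery("What is the difference between Tool A and Tool B?", "comparison"),
    BenchmarkQuery("What tool should I use to track AI citations for my brand?", "use_case"),
    BenchmarkQuery("Best CRM for a 50-person SaaS company", "use_case"),
]


def layer_coverage(queries):
    """Count queries per layer so an under-represented layer is easy to spot."""
    counts = Counter(q.layer for q in queries)
    return {layer: counts.get(layer, 0) for layer in LAYERS}


print(layer_coverage(QUERY_SET))
```

Tagging each query with its layer up front makes it possible to report citation rate per layer later, not just overall.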

How Many Queries Do You Need?

For a meaningful baseline, aim for 30 to 50 queries. This is enough to identify patterns without the data collection becoming unmanageable. If you have 100 queries, manual auditing alone takes 8 to 12 hours per month across four platforms. Start with 30, then expand once your tracking system is in place.

The best source for queries is not a keyword tool. It is your sales team. The questions buyers ask on discovery calls, the objections they raise, and the comparisons they mention are exactly the queries that trigger AI recommendations at the research stage of the buying journey.

Step 2: Run Systematic Prompt Tests Across Multiple LLMs

Once your query set is ready, run each query across ChatGPT, Perplexity, Claude, and Gemini. Each platform has different citation behaviours, and benchmarking on only one engine gives you a dangerously incomplete picture.

Platform behaviour varies significantly. According to Semrush's AI search trends data, ChatGPT now handles 1 billion daily queries and accounts for 77% of AI referral traffic. But Perplexity and Microsoft Copilot include external source links in over 77% of responses, while ChatGPT links in approximately 31% of responses. Claude typically answers in prose without URL citations unless web search is enabled. A competitor can be invisible on Perplexity but dominant in Claude's prose recommendations, and you would miss it entirely if you only checked one engine.

What to Record for Each Query

For every query on every platform, capture the following in a spreadsheet:

Field	What to Record
Query text	Exact prompt used
Platform	ChatGPT / Perplexity / Claude / Gemini
Date	For tracking changes over time
Your brand cited?	Yes / No
Your brand position	Order of mention (1 = first, 2 = second, etc.), or absent
Competitor brands cited	All brand names that appeared
Competitor positions	First, second, third, etc.
URL cited?	Domain linked (Perplexity primarily)
Sentiment	Positive / neutral / negative framing
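
If a script is easier to maintain than a hand-edited spreadsheet, the fields above could be captured roughly as follows. This is a sketch under assumptions: `PromptResult` and `to_csv` are hypothetical names, competitor positions are implied by the order of the `competitors_cited` string, and the competitor names and domain are placeholders.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class PromptResult:
    query_text: str                 # exact prompt used
    platform: str                   # ChatGPT / Perplexity / Claude / Gemini
    date: str                       # ISO date, for tracking changes over time
    brand_cited: bool
    brand_position: Optional[int]   # order of mention; None when absent
    competitors_cited: str          # semicolon-separated, in order of mention
    url_cited: str                  # domain linked, if any
    sentiment: str                  # positive / neutral / negative


def to_csv(rows):
    """Serialise results to CSV text that opens cleanly in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PromptResult)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


example = PromptResult(
    query_text="What is the best marketing analytics tool for a B2B SaaS company?",
    platform="Perplexity",
    date="2026-05-15",
    brand_cited=True,
    brand_position=4,
    competitors_cited="Competitor A; Competitor B",  # placeholder names
    url_cited="competitor-a.example",                # placeholder domain
    sentiment="neutral",
)
print(to_csv([example]))
```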

A Concrete Example

Imagine you sell B2B marketing analytics software. You run the query "What is the best marketing analytics tool for a B2B SaaS company?" across all four platforms. The results look like this:

ChatGPT: Mentions three competitors by name. You are not mentioned. One competitor is described as "the industry standard for attribution."
Perplexity: Cites five tools. You appear fourth. A competitor appears first with a source link to their comparison guide.
Claude: Recommends two tools in detail. You are not mentioned. One competitor is described as "particularly strong for pipeline analytics."
Gemini: Mentions four tools. You appear second.

This single query run reveals that you have a citation gap on ChatGPT and Claude, that one competitor owns the "attribution" framing, and that another competitor's comparison guide is being cited as a source. Each of those observations points to a specific action.
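
The same example can be written down as data so the gaps fall out mechanically. This is only a sketch: the `RUN` dictionary and `citation_gaps` helper are hypothetical, and the values mirror the four results described above.

```python
# One entry per platform for the single query run described above.
# your_position is the order in which your brand was mentioned; None means absent.
RUN = {
    "ChatGPT": {"brands_mentioned": 3, "your_position": None},
    "Perplexity": {"brands_mentioned": 5, "your_position": 4},
    "Claude": {"brands_mentioned": 2, "your_position": None},
    "Gemini": {"brands_mentioned": 4, "your_position": 2},
}


def citation_gaps(run):
    """Return the platforms where your brand was not mentioned at all."""
    return [platform for platform, result in run.items() if result["your_position"] is None]


gaps = citation_gaps(RUN)
cited_on = len(RUN) - len(gaps)
print(f"Cited on {cited_on} of {len(RUN)} platforms; gaps: {', '.join(gaps)}")
```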

Run this process for your full query set once a month. Consistency matters more than frequency. The same queries, the same platforms, the same recording format, every month.

Step 3: Score Citation Rate and AI Share of Voice

Raw data from your prompt tests is not a benchmark. You need to convert it into two comparable metrics: citation rate and AI share of voice.

Citation rate is the percentage of your query runs (each query on each platform counts as one data point) where your brand appeared in the AI response. If you ran 50 queries across four platforms (200 total data points) and your brand appeared in 30 of them, your citation rate is 15%. This is your North Star metric for AEO. Unlike a keyword ranking, which measures one page for one query, citation rate reflects how AI systems perceive your authority across the full breadth of your category.
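
The arithmetic behind that figure is short enough to write out. The function name `citation_rate` is an illustrative assumption; the numbers are the ones from the paragraph above.

```python
def citation_rate(appearances, queries, platforms):
    """Share of query-platform data points in which the brand appeared."""
    data_points = queries * platforms
    if data_points == 0:
        raise ValueError("at least one query and one platform are required")
    return appearances / data_points


# 50 queries on 4 platforms gives 200 data points; the brand appeared in 30.
rate = citation_rate(appearances=30, queries=50, platforms=4)
print(f"Citation rate: {rate:.0%}")
```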

AI share of voice measures your brand's mentions as a proportion of all brand mentions across the same query set. If you and your three main competitors were mentioned a combined 120 times, and your brand accounted for 18 of those mentions, your AI share of voice is 15%. This tells you not just whether you are cited, but whether you are winning the category relative to the competition.
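
As a worked version of that calculation: the total of 120 mentions and your 18 come from the paragraph above, while the split among the three competitors is a hypothetical chosen only so the numbers add up.

```python
def share_of_voice(mentions):
    """Each brand's mentions as a proportion of all brand mentions."""
    total = sum(mentions.values())
    if total == 0:
        raise ValueError("no mentions recorded")
    return {brand: count / total for brand, count in mentions.items()}


# 18 of 120 is from the text; the competitor split is an assumed example.
MENTIONS = {"Your Brand": 18, "Competitor A": 54, "Competitor B": 36, "Competitor C": 12}

shares = share_of_voice(MENTIONS)
for brand, share in shares.items():
    print(f"{brand}: {share:.0%}")
```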

Only 1% of users click on AI summary links, according to Semrush's AI search research. That makes citation rate the primary value metric, not click-through. Being named is the win.

Scoring the Benchmark

Once you have these numbers for yourself and your competitors, build a simple comparison table:

Brand	Citation Rate	AI Share of Voice	Avg. Position	Perplexity URL Citations
Your Brand	15%	15%	3.2	4
Competitor A	44%	41%	1.4	18
Competitor B	28%	26%	2.1	9
Competitor C	9%	8%	4.0	1

This table immediately shows you the gap. Competitor A has a citation rate nearly three times higher and is consistently named first. That is not a content problem. That is an authority and entity recognition problem, which requires a different set of interventions than simply writing more blog posts.
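
To make the size of the gap explicit, the table can be loaded into a small structure and compared against the leader. The `BENCHMARK` dictionary and `gap_to_leader` helper below are illustrative names; the figures are copied from the table.

```python
BENCHMARK = {
    "Your Brand":   {"citation_rate": 0.15, "share_of_voice": 0.15, "avg_position": 3.2, "url_citations": 4},
    "Competitor A": {"citation_rate": 0.44, "share_of_voice": 0.41, "avg_position": 1.4, "url_citations": 18},
    "Competitor B": {"citation_rate": 0.28, "share_of_voice": 0.26, "avg_position": 2.1, "url_citations": 9},
    "Competitor C": {"citation_rate": 0.09, "share_of_voice": 0.08, "avg_position": 4.0, "url_citations": 1},
}


def gap_to_leader(table, brand, metric="citation_rate"):
    """Return the leading brand on a metric and how many times higher it scores."""
    leader = max(table, key=lambda name: table[name][metric])
    ratio = table[leader][metric] / table[brand][metric]
    return leader, ratio


leader, ratio = gap_to_leader(BENCHMARK, "Your Brand")
print(f"{leader} has a citation rate {ratio:.1f}x yours")
```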

Step 4: Audit the Signals Behind Your Competitors' Citations

Knowing that a competitor has a 44% citation rate is only half the job. The other half is understanding why. You need to audit the content signals that explain their citation advantage. This is where benchmarking moves from measurement to diagnosis.

Check Which Sources AI Engines Are Citing

On Perplexity, most responses include source links. When a competitor appears in the answer, note which domains are being cited alongside them. These are the third-party sources the AI trusts. Common patterns include:

Industry publications: G2, Capterra, TechCrunch, and vertical-specific trade outlets
Comparison guides: Third-party "best of" roundups that name multiple vendors
Their own structured content: Glossary pages, how-to guides, and FAQ pages with schema markup

If a competitor is cited repeatedly alongside the same three or four domains, those domains are functioning as credibility anchors. Earning a mention on those same sources is one of the fastest ways to improve your own citation rate.
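
Spotting those recurring domains is a counting exercise. The sketch below assumes you have logged, for each response, the brand named and the domains cited with it; the observations and the `credibility_anchors` helper are hypothetical.

```python
from collections import Counter

# Hypothetical log: (brand named in the answer, domains cited in that answer).
OBSERVATIONS = [
    ("Competitor A", ["g2.com", "capterra.com", "competitor-a.example"]),
    ("Competitor A", ["g2.com", "techcrunch.com"]),
    ("Competitor A", ["g2.com", "capterra.com"]),
    ("Competitor B", ["capterra.com"]),
]


def credibility_anchors(observations, brand, min_count=2):
    """Domains cited alongside a brand in at least min_count responses."""
    counts = Counter(
        domain
        for name, domains in observations
        if name == brand
        for domain in set(domains)  # count each domain once per response
    )
    return [(domain, n) for domain, n in counts.most_common() if n >= min_count]


print(credibility_anchors(OBSERVATIONS, "Competitor A"))
```

The `min_count` threshold is a judgment call; with a 30 to 50 query set, a domain that recurs only twice may still be noise.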

Analyse the Framing AI Engines Use

Pay close attention to the language AI engines use when describing your competitors. Phrases like "industry standard for attribution" or "particularly strong for pipeline analytics" are not random. They reflect the dominant narrative that exists across the web about that brand. AI systems synthesize these descriptions from the content they have indexed.

When you see a competitor consistently described with a specific phrase, search for that phrase on Google. You will almost always find it in a third-party review, a case study, or a comparison article. That is the content driving the framing, and it points directly to the gap you need to fill.

Review Their Owned Content Structure

Visit the top pages on a competitor's site that are being cited. Look for three things:

Schema markup: Are they using FAQ, HowTo, or Article schema? Structured data makes content easier for AI systems to parse and extract.
Fact density: Research cited in AEO studies shows that adding expert quotes improves AI visibility by 41% and adding statistics improves it by 30%. Competitors with high citation rates almost always publish content that leads with specific data points, not general claims.
Answer formatting: Do their pages open with a concise, direct answer to the query? AI engines extract answers most reliably from content that answers the question in the first two to three sentences of a section.
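
For reference when reviewing a competitor's markup, FAQ structured data follows the schema.org `FAQPage` shape. The sketch below builds that JSON-LD from question and answer pairs; the `faq_jsonld` helper is a hypothetical name, and the sample pair is taken from this article's own FAQ.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


markup = faq_jsonld([
    (
        "What is competitor benchmarking for AEO?",
        "The process of measuring how often rival brands appear in "
        "AI-generated answers, which queries they own, and what signals "
        "drive those citations.",
    ),
])
print(json.dumps(markup, indent=2))
```

The printed JSON would normally sit inside a `script` tag of type `application/ld+json` on the page, which is what to look for in a competitor's page source.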

The diagnostic question is simple: What does their content do that yours does not? The answer becomes your content brief.

Step 5: Turn the Benchmark Into an Action List

A benchmarking exercise that ends with a spreadsheet has failed. The entire point is to generate a prioritised list of actions. Here is how to convert your findings into work.

Map Your Content Gaps

Cross-reference your query set against your citation data. For every query where a competitor is cited and you are not, ask: does a page on your site directly answer this query? In most cases, the answer is no. That is your content gap list.

Prioritise gaps by two factors: query frequency (how often does this type of question appear in your set?) and competitor advantage (how consistently does a specific competitor own this query?). The highest-priority gaps are the ones where a single competitor dominates across multiple platforms and you have no dedicated content.
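
One possible way to turn those two factors into a sortable score is sketched below. The formula, the 0.5 discount for gaps that already have a dedicated page, and the sample gaps are all assumptions made for illustration, not a standard method.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    query_type: str
    query_count: int         # query frequency: queries of this type in your set
    platforms_owned: int     # platforms where one competitor is cited and you are not
    has_dedicated_page: bool


def priority(gap, total_platforms=4):
    """Assumed scoring: frequency times competitor dominance, discounted if a page exists."""
    dominance = gap.platforms_owned / total_platforms
    discount = 0.5 if gap.has_dedicated_page else 1.0
    return gap.query_count * dominance * discount


# Hypothetical gaps for the marketing analytics example.
GAPS = [
    Gap("pricing comparisons", query_count=10, platforms_owned=1, has_dedicated_page=True),
    Gap("attribution", query_count=8, platforms_owned=4, has_dedicated_page=False),
    Gap("pipeline analytics", query_count=5, platforms_owned=2, has_dedicated_page=False),
]

ranked = sorted(GAPS, key=priority, reverse=True)
for gap in ranked:
    print(f"{gap.query_type}: {priority(gap):.2f}")
```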

Build one page per gap. Not a blog post that mentions the topic in passing. A dedicated page that opens with a direct answer, uses structured data, and cites verifiable sources. According to Conductor's benchmarks research, AI traffic converts at roughly twice the rate of traditional traffic in a third of sessions, which makes content built for AI citation a high-ROI investment, not a vanity exercise.

Strengthen Your Entity Signals

If your citation rate is low across the board, the issue is often entity recognition, not content quality. AI engines need to understand what your brand is, what category it belongs to, and what it is known for. Inconsistent descriptions across your site, your LinkedIn page, your G2 profile, and third-party mentions create ambiguity that suppresses citations.

Audit your brand description across every public surface. Your homepage, your About page, your G2 and Capterra profiles, your press mentions, and your partner pages should all use consistent language to describe what you do and who you serve. This consistency is what AEO practitioners call entity coherence, and it is one of the clearest signals that separates frequently cited brands from invisible ones.
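
A rough first pass at this audit can be automated by comparing each surface's description against a reference. The sketch below uses string similarity from Python's standard `difflib` module as a crude proxy for consistency; the descriptions, the 0.8 threshold, and the `inconsistent_surfaces` helper are assumptions, and a human read is still needed to judge meaning.

```python
from difflib import SequenceMatcher

# Hypothetical brand descriptions collected from each public surface.
DESCRIPTIONS = {
    "homepage": "B2B marketing analytics software for SaaS companies",
    "linkedin": "B2B marketing analytics software for SaaS companies",
    "g2": "All-in-one growth platform for modern teams",
}


def inconsistent_surfaces(descriptions, reference="homepage", threshold=0.8):
    """Flag surfaces whose description differs markedly from the reference text."""
    base = descriptions[reference].lower()
    flagged = []
    for surface, text in descriptions.items():
        similarity = SequenceMatcher(None, base, text.lower()).ratio()
        if similarity < threshold:
            flagged.append(surface)
    return flagged


print(inconsistent_surfaces(DESCRIPTIONS))
```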

Earn Mentions on the Sources AI Trusts

Your prompt test data will reveal which third-party domains are being cited most often in your category. These are your target publications. A mention on a domain that AI engines already trust is often worth more than ten pages of owned content.

Prioritise three types of placements:

Review platforms: A complete, updated profile on G2 or Capterra with specific use-case language gets indexed and cited regularly by AI engines.
Comparison roundups: Reach out to authors of "best of" guides in your category. A single inclusion in a well-cited roundup can move your citation rate meaningfully within one to two months.
Trade publications: A contributed article or expert quote in a vertical publication creates a citable source that AI engines can extract and attribute to your brand.

Set a Monthly Tracking Cadence

Benchmarking is not a one-time audit. It is a recurring process. Run your full query set once a month, update your share of voice table, and track whether your interventions are moving the numbers.

The leading indicator to watch is citation rate on the platforms where you have the biggest gap. If you publish a dedicated page targeting a specific query gap, you should expect to see movement within six to ten weeks as AI engines re-index your content and adjust their responses.
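
Tracking that movement comes down to comparing one month's per-platform citation rates with the previous month's. The figures in `HISTORY` below are invented for illustration, and `month_over_month` is a hypothetical helper.

```python
# Hypothetical per-platform citation rates from two monthly benchmark runs.
HISTORY = {
    "2026-05": {"ChatGPT": 0.06, "Perplexity": 0.22, "Claude": 0.04, "Gemini": 0.28},
    "2026-06": {"ChatGPT": 0.10, "Perplexity": 0.24, "Claude": 0.04, "Gemini": 0.26},
}


def month_over_month(history, previous, current):
    """Change in citation rate per platform between two monthly runs."""
    return {
        platform: round(history[current][platform] - history[previous][platform], 4)
        for platform in history[current]
    }


deltas = month_over_month(HISTORY, "2026-05", "2026-06")
for platform, delta in deltas.items():
    print(f"{platform}: {delta * 100:+.0f} pts")
```

Because the query set and recording format stay fixed from month to month, a change in these numbers can be read as a change in visibility instead of a change in method.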

The benchmark is only as useful as the actions it generates. Set a monthly review date, assign ownership of each gap to a specific team member, and treat citation rate the same way you treat keyword rankings: a metric that requires consistent effort to move.

Frequently Asked Questions

What is competitor benchmarking for AEO?

Competitor benchmarking for AEO is the process of measuring how often rival brands appear in AI-generated answers, which queries they own, and what signals drive those citations. The goal is to identify gaps in your visibility and turn them into a concrete action plan, rather than reacting to individual prompt results.

Which AI platforms should you benchmark?

Benchmark across the platforms your buyers actually use, typically ChatGPT, Perplexity, Claude, and Gemini. Each system behaves differently: Perplexity cites sources directly, Claude tends to give prose recommendations, often without URLs, and ChatGPT accounts for the majority of AI referral traffic. Looking at one engine alone gives you an incomplete view of your competitive position.

What metrics matter most in AEO benchmarking?

The two core metrics are citation rate and AI share of voice. Citation rate shows how often your brand appears across your test queries. Share of voice shows how much of the total brand visibility in your category belongs to you versus competitors. Both matter more than AI referral traffic volume, because visibility precedes clicks.

How many benchmark queries do you need?

A useful starting point is 30 to 50 queries. That gives you enough coverage to see patterns across category, comparison, and use-case intent without making the audit too heavy to run monthly. Expand the set once your tracking process is established and you have capacity to act on the additional data.

What should you do with the benchmark results?

Use the results to build a prioritised action list: content pages to create for each query gap, entity signals to align across your public profiles, and third-party sources to earn mentions on. A benchmark that ends with a report and no assigned actions is wasted effort. The output should be a fix list with owners and deadlines.