llms.txt Explained: What It Is, Why It Matters, and Why Google Now Checks for It

Executive Summary
- llms.txt is a plain-text file that lives at the root of your website and tells AI crawlers what your site is about, which pages matter, and how to read your content efficiently.
- As of May 2026, Google's Chrome Lighthouse now includes a dedicated "Agentic Browsing" audit that checks whether your site has an llms.txt file, signalling that AI readiness has moved from optional to measurable.
- AI search traffic converts at 14.2% compared to Google's 2.8%, making it 5x more valuable per visitor. Yet most business websites are still invisible to AI engines because their technical foundation was never built for them (Exposure Ninja, 2026).
- Windgrove helped Opal go from 0% AI visibility to 15.9% and 1,766 brand mentions across LLMs in just 31 days. llms.txt configuration was one of the first technical steps in that process.
Most business owners have never heard of llms.txt. That is not surprising. It is a relatively new file format, proposed in 2024 and still gaining mainstream awareness. But the gap between "haven't heard of it" and "need to act on it" is closing fast.
In May 2026, Google added a check for llms.txt to Chrome's Lighthouse auditing tool — the same tool developers and marketers use to measure site performance and technical health. The new "Agentic Browsing" category evaluates how well your site is structured for machine interaction, and the presence of an llms.txt file is one of the explicit signals it looks for.
That is a meaningful shift. Google has publicly said llms.txt is not required for traditional search rankings. But Chrome is now flagging its absence as a readiness gap for AI agents. Those two things can both be true at once, and understanding the distinction is exactly what this article is about.
The core question: If your buyers are increasingly using ChatGPT, Perplexity, and Google AI Overviews to find products and services like yours, does your site give those systems what they need to find, read, and recommend you?
For most sites, the honest answer is no. llms.txt is one part of fixing that.
What Is llms.txt?
llms.txt is a plain-text file placed at the root of a website — typically accessible at yourdomain.com/llms.txt — that provides AI language models with a structured, machine-readable summary of your site's content and purpose.
Think of it as a table of contents written specifically for AI systems. A sitemap tells search engine crawlers which pages exist. llms.txt goes further. It tells AI agents what your site is actually about, which pages are most important, and how to interpret your content without crawling every page from scratch.
The Analogy That Makes It Click
robots.txt tells crawlers what they can and cannot access. llms.txt tells AI agents what is worth their attention and how to understand it.
The format was first proposed in 2024 by Jeremy Howard of Answer.AI as a community standard, not a Google-mandated protocol. It is written in simple Markdown and typically contains:
- A brief description of the organisation and what it does
- Links to the most important pages on the site (product pages, documentation, key articles)
- Optional context about content categories, intended audiences, and how the site is structured
Here is a simplified example of what an llms.txt file looks like in practice:
# Acme Software
> Acme builds project management tools for remote engineering teams.
## Core Pages
- [Product Overview](https://acme.com/product): What Acme does and who it is for
- [Pricing](https://acme.com/pricing): Plans and pricing for teams of all sizes
- [Documentation](https://acme.com/docs): Full technical documentation
## Blog
- [How we reduced onboarding time by 40%](https://acme.com/blog/onboarding)
- [Remote team productivity guide](https://acme.com/blog/remote-teams)
That is it. No code. No developer required. A text file with a clear structure.
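Writing the file needs no code, but the structure is regular enough that a script can read it back. Here is a minimal Python sketch of how a tool might parse the example above. The function name `parse_llms_txt` and the returned dictionary layout are illustrative assumptions, not part of the llms.txt proposal or of any AI crawler's actual implementation.

```python
import re

# Matches "- [Title](https://url): optional note" list items.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Parse llms.txt-style Markdown into title, summary, and sections.

    Hypothetical helper for illustration only.
    """
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            # A second-level heading opens a new section of links.
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(
                    {
                        "title": match["title"],
                        "url": match["url"],
                        "note": match["note"] or "",
                    }
                )
    return result


EXAMPLE = """\
# Acme Software
> Acme builds project management tools for remote engineering teams.
## Core Pages
- [Product Overview](https://acme.com/product): What Acme does and who it is for
- [Pricing](https://acme.com/pricing): Plans and pricing for teams of all sizes
- [Documentation](https://acme.com/docs): Full technical documentation
## Blog
- [How we reduced onboarding time by 40%](https://acme.com/blog/onboarding)
- [Remote team productivity guide](https://acme.com/blog/remote-teams)
"""

parsed = parse_llms_txt(EXAMPLE)
print(parsed["title"])
print(list(parsed["sections"]))
```

A reader that gets a title, a one-line summary, and a short list of annotated links has most of what it needs to orient itself, which is the point of the format.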
What llms.txt Is Not
It is not a replacement for robots.txt, which controls crawler access permissions. It does not override your sitemap. And it does not directly determine whether you rank in Google's traditional blue-link results.
What it does is reduce the cognitive load on AI agents that are trying to understand your site quickly. Google's own Lighthouse documentation states it plainly: "Without llms.txt, agents may spend more time crawling the site to understand its high-level structure and primary content."
Less time crawling means faster comprehension. Faster comprehension means a higher likelihood of accurate, confident citation.
Why Google Now Checks for It
On 20 May 2026, Search Engine Land reported that Google added llms.txt detection to Chrome's Lighthouse auditing tool under a new category called "Agentic Browsing." This is the same Lighthouse tool that measures Core Web Vitals, accessibility, and SEO performance — the scores that developers and marketers already track closely.
The new category does not produce a traditional 0-100 score. It surfaces a pass/fail ratio across a set of agentic readiness signals. The checks include:
- WebMCP integration
- Accessibility tree integrity
- Layout stability through Cumulative Layout Shift (CLS)
- Presence of an llms.txt file
The framing matters. Google is not checking for llms.txt because it affects traditional search rankings. It is checking because Chrome is increasingly used by AI agents, and agents benefit from sites that are easier for machines to navigate.
The Nuance Worth Understanding
Google's John Mueller addressed this directly in May 2026, responding to a question from SEO expert Lily Ray about the apparent contradiction between Google's own use of llms.txt and its official guidance that the file is not needed for search:
"The short answer is that it's not done for search. There's more to websites than just SEO. It's worth separating 'discovery' (finding the website or pages with a global search engine) vs 'functionality' (once someone has found the page, helping them to best do the task they want to do)."
This is a useful distinction. llms.txt does not help Google find your site. It helps AI agents use your site more effectively once they arrive. For businesses whose buyers increasingly interact with AI tools, that functional layer is exactly where visibility is won or lost.
Lighthouse Flags AI Readiness Gaps: Google's Lighthouse now flags the absence of llms.txt as a gap in your site's machine readiness. Whether or not it affects your ranking today, it signals where the web is heading. Sites that are easy for AI agents to read will be cited more confidently than sites that are not.
The same Lighthouse category also emphasises accessibility tree integrity and layout stability — signals that AI agents rely on as their "primary data model" when navigating pages. llms.txt is one layer of a broader machine-readability picture.
Why This Matters for Your Business Right Now
The timing of Google's Lighthouse update is not coincidental. It reflects a broader shift in how buyers find and evaluate products and services.
51% of B2B software buyers now start their research with an AI chatbot more often than with Google (G2, April 2026). That number has moved from a prediction to a reality. If your site is not structured for AI comprehension, you are invisible at the moment it matters most: when your buyer is forming their shortlist.
The traffic quality argument makes this even more urgent. According to Exposure Ninja's 2026 AI Search Statistics report, AI search traffic converts at 14.2% compared to Google's 2.8%. That means a visitor arriving from an AI-generated answer is roughly five times more likely to convert than one arriving from a traditional search result.
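To make the "roughly five times" figure concrete, here is a small Python sketch of the arithmetic. The two rates come from the Exposure Ninja figures quoted above. The 1,000-visitor sample size is an assumption chosen only to make the numbers easy to read.

```python
AI_SEARCH_RATE = 0.142   # 14.2% conversion rate quoted for AI search traffic
GOOGLE_RATE = 0.028      # 2.8% conversion rate quoted for Google traffic
VISITORS = 1000          # illustrative sample size (assumption)


def conversion_lift(rate_a, rate_b):
    """Return how many times more often rate_a converts than rate_b."""
    return rate_a / rate_b


lift = conversion_lift(AI_SEARCH_RATE, GOOGLE_RATE)
ai_conversions = AI_SEARCH_RATE * VISITORS
google_conversions = GOOGLE_RATE * VISITORS

print(f"Lift: {lift:.2f}x")                                # Lift: 5.07x
print(f"AI search: {ai_conversions:.0f} per {VISITORS}")    # AI search: 142 per 1000
print(f"Google: {google_conversions:.0f} per {VISITORS}")   # Google: 28 per 1000
```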
The Scale of What AI Agents Are Already Doing
The 2026 State of AI Traffic report from Human Security documented a fundamental shift in how automated systems interact with the web:
- Monthly volumes of AI-driven traffic grew 187% from January to December 2025, nearly tripling over the calendar year.
- Traffic from AI agents and agentic browsers grew 7,851% year over year.
- OpenAI's bots alone accounted for approximately 69% of all observed AI bot traffic in 2025.
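Percentage growth is easy to misread, so here is a short Python sketch that converts the growth figures above into multipliers. The helper name `growth_to_multiplier` is ours, not the report's.

```python
def growth_to_multiplier(percent_growth):
    """Convert percentage growth into an end/start multiplier.

    100% growth means the value doubled, so the multiplier is 2.0.
    """
    return 1 + percent_growth / 100


ai_traffic = growth_to_multiplier(187)       # monthly AI-driven traffic
agent_traffic = growth_to_multiplier(7851)   # AI agents and agentic browsers

print(f"{ai_traffic:.2f}x")     # 2.87x, hence "nearly tripling"
print(f"{agent_traffic:.2f}x")  # 79.51x
```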
These are not future projections. This is what already happened in 2025. AI agents are reading, navigating, and increasingly transacting on the web at a scale most businesses have not yet accounted for in their technical infrastructure.
The Conversion Gap Is Already Costing You
Here is the scenario that plays out every day for businesses without proper AI infrastructure:
- A buyer asks ChatGPT or Perplexity: "What's the best [your category] tool for [your ICP]?"
- The AI generates a confident answer naming three or four competitors.
- Your brand is not mentioned. Not because your product is inferior, but because the AI could not efficiently read and understand your site.
- The buyer never visits your site. The deal goes elsewhere.
llms.txt alone does not solve this. But it is part of the technical foundation that makes your site legible to the systems your buyers are now relying on. And it is one of the fastest, lowest-effort technical fixes available.
What Goes Into a Well-Configured llms.txt File
There is no single mandatory format for llms.txt. The community standard proposed by Jeremy Howard of Answer.AI in 2024 provides a flexible Markdown-based structure, and most implementations follow a similar pattern.
A well-configured llms.txt file typically includes four components:
Component | Purpose | Example |
|---|---|---|
H1 heading | Identifies the site or organisation | |
Blockquote description | One-sentence summary of what the site does | |
Section links | Curated list of key pages with short descriptions | Core pages, documentation, blog articles |
Optional context | Notes on content type, audience, or site structure | |
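If your page list already lives in a CMS or spreadsheet, the four components can be assembled programmatically. The Python sketch below is one possible way to do that. The function name `build_llms_txt`, its parameters, and the sample data are all hypothetical, and the output simply follows the H1, blockquote, and section-link pattern described in this article.

```python
def build_llms_txt(name, summary, sections, context=None):
    """Assemble llms.txt content from the four components.

    name:     organisation or site name (H1 heading)
    summary:  one-sentence description (blockquote)
    sections: dict mapping a section heading to (title, url, note) tuples
    context:  optional free-text notes on audience or structure
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    if context:
        lines += [context, ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


content = build_llms_txt(
    name="Acme Software",
    summary="Acme builds project management tools for remote engineering teams.",
    sections={
        "Core Pages": [
            ("Pricing", "https://acme.com/pricing",
             "Plans and pricing for teams of all sizes"),
            ("Documentation", "https://acme.com/docs", ""),
        ],
    },
    context="Content is written for engineering managers.",
)
print(content)
```

Generating the file from a single source of truth makes it easier to keep in step with the site as pages are added or retired.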
What to Include and What to Skip
The goal is not to list every page on your site. It is to give AI agents a fast, accurate orientation. Think of it as the briefing document you would hand someone before they read your entire website.
Include:
- Your homepage and core product or service pages
- Pricing page (if public)
- Key blog articles or resources that represent your expertise
- Contact or booking pages (so AI agents can recommend the right next step)
- Any comparison or "vs" pages you have published
Skip:
- Privacy policy, terms of service, and legal pages
- Admin or login pages
- Thin or duplicate content
- Pages under active construction
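Part of the skip list can be automated. Here is a minimal Python sketch that filters candidate URLs by path prefix. The prefixes in `SKIP_PREFIXES` and the sample URLs are assumptions for illustration, so adjust them to your own site's structure. Thin, duplicate, or unfinished pages cannot be detected from a URL alone and still need a human review.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes for legal, admin, and login pages.
SKIP_PREFIXES = ("/privacy", "/terms", "/legal", "/admin", "/login")


def should_include(url, skip_prefixes=SKIP_PREFIXES):
    """Return True if the URL's path does not start with a skipped prefix."""
    path = urlparse(url).path.lower()
    return not any(path.startswith(prefix) for prefix in skip_prefixes)


candidates = [
    "https://acme.com/product",
    "https://acme.com/pricing",
    "https://acme.com/privacy-policy",
    "https://acme.com/admin/settings",
    "https://acme.com/blog/remote-teams",
]

selected = [url for url in candidates if should_include(url)]
print(selected)
```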
The llms-full.txt Variant
Some sites also publish an llms-full.txt file, which contains the full text content of key pages rather than just links. This is particularly useful for documentation-heavy sites where AI agents frequently need to extract detailed technical information. For most business websites, the standard llms.txt is sufficient.
One important note: llms.txt is a guide, not a command. AI agents can choose to follow it or ignore it. But providing the file removes friction from the process. An agent that has a clear map of your site will almost always produce a more accurate, more confident summary of what you do. That accuracy is what drives citation.
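For sites that do publish the llms-full.txt variant, one simple approach is to join the text of each key page under its own heading. The Python sketch below assumes you already have each page's text as a string. The layout it produces (a heading per page followed by a `Source:` line) is our own convention for illustration, since this article does not prescribe an exact format for the file.

```python
def build_llms_full(name, pages):
    """Combine page texts into one llms-full.txt-style document.

    pages: list of (title, url, body_text) tuples, in the order to publish.
    """
    parts = [f"# {name}", ""]
    for title, url, body in pages:
        parts += [f"## {title}", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


full_text = build_llms_full(
    "Acme Software",
    [
        ("Product Overview", "https://acme.com/product",
         "Acme helps remote engineering teams plan and ship work.\n"),
        ("Documentation", "https://acme.com/docs",
         "Install the Acme CLI, then connect your repository."),
    ],
)
print(full_text)
```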
llms.txt in Practice: The Opal Case Study
Understanding what llms.txt does in theory is one thing. Seeing what happens when it is part of a complete AI visibility overhaul is another.
In late March 2026, Windgrove began working with Opal, a charge card and spend management platform built for digital marketing agencies. The situation was stark: a strong product with zero AI search visibility. Opal had only four indexed pages, no blog, weak metadata, and no sitemap submitted to Google Search Console. Anyone searching their category in ChatGPT, Perplexity, or Google AI Overviews was finding only competitors.
The Technical Foundation First
Before publishing a single piece of content, Windgrove cleared every blocker between Opal's pages and AI crawlers. The work included:
- Submitting an XML sitemap to Google Search Console
- Overhauling and redeploying robots.txt
- Building and configuring llms.txt for LLM indexing and citation
- Rewriting meta titles and descriptions across every core page
- Restructuring heading hierarchy and internal linking
- Resolving indexing issues on key product and landing pages
llms.txt was not an afterthought. It was part of the first wave of technical work, alongside the sitemap and robots.txt. The goal was to make Opal's site fully legible to AI systems before any content was published on top of it.
The Results After 31 Days
The full Opal case study documents what happened next:
Metric | Before | After (31 days) |
|---|---|---|
AI Visibility Score | 0% | 15.9% |
Brand Mentions (LLMs + web) | 0 | 1,766 |
Site Health Score | 66.2 | 80.7 (top 10% of benchmarked sites) |
AEO Articles Live | 0 | 8 |
Average LLM Position | — | #1 |
Within one week of launching the Ad Pay page, Opal ranked #2 for "ad pay" and #2 for "ad spend cards" — bottom-funnel terms where buyers are already in-market and evaluating options.
The key insight from Opal's results: Most sites take three to six months to see meaningful movement on bottom-funnel terms. Opal was there in seven days. That is what a clean technical foundation does before content is built on top of it.
Opal is now being surfaced inside Perplexity AI responses as a named recommendation. Not as an ad, not as a sponsored result. As the answer. The 15.9% AI visibility score reached in 31 days is a starting point, not a ceiling. Content compounds on a strong technical foundation. It stagnates on a broken one.
You can see Windgrove's broader track record of results at windgrove.ai/proof.
How to Get Started With llms.txt
Creating an llms.txt file is not technically complex. The challenge is doing it strategically — knowing which pages to include, how to describe your organisation accurately, and how to structure the file so it actually improves AI comprehension rather than just existing as a checkbox.
The Basic Steps
- Audit what you have. Before writing the file, map your site's most important pages. Product pages, service pages, key blog articles, your about page, and your contact or booking page are the typical starting points.
- Write a clear one-sentence description. This is the most important line in the file. It tells AI agents what your organisation does, who it serves, and what makes it distinct. Vague descriptions produce vague citations.
- Create the file in Markdown. Use a plain text editor. The format is simple: an H1 with your brand name, a blockquote with your description, and organised sections of links with short annotations.
- Place it at your domain root. The file must be accessible at yourdomain.com/llms.txt. It should not require authentication to access.
- Keep it updated. As your site evolves (new products, new content, new pages), your llms.txt should reflect those changes.
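Once the file is live, it is worth confirming that it is reachable without authentication. Here is a minimal Python sketch using only the standard library. The helper names, the `llms-txt-check/0.1` user agent string, and the "starts with an H1" heuristic are all assumptions for illustration, not an official validator.

```python
import urllib.error
import urllib.request


def llms_txt_url(domain):
    """Build the expected root-level llms.txt URL for a domain or URL."""
    domain = domain.strip().rstrip("/")
    if "://" in domain:
        domain = domain.split("://", 1)[1]   # drop the scheme
    host = domain.split("/", 1)[0]           # drop any path
    return f"https://{host}/llms.txt"


def check_llms_txt(domain, timeout=10):
    """Fetch llms.txt anonymously and report what came back.

    A 401 or 403 status suggests the file sits behind authentication.
    Redirects are followed, so has_h1 acts as a rough sanity check that
    the response is an llms.txt file rather than, say, a login page.
    """
    url = llms_txt_url(domain)
    request = urllib.request.Request(
        url, headers={"User-Agent": "llms-txt-check/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return {
                "url": url,
                "status": response.status,
                "has_h1": body.lstrip().startswith("# "),
            }
    except urllib.error.HTTPError as err:
        return {"url": url, "status": err.code, "has_h1": False}
    except OSError:
        # DNS failures, refused connections, and timeouts land here.
        return {"url": url, "status": None, "has_h1": False}


# Example usage (performs a real network request):
# print(check_llms_txt("yourdomain.com"))
```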
What llms.txt Cannot Do Alone
A common mistake is treating llms.txt as a standalone fix. It is not. It is one component of a broader technical and content infrastructure that AI engines use to evaluate whether your site is worth citing.
The full picture includes:
- XML sitemap: ensures AI crawlers can discover all your pages
- robots.txt: controls which pages crawlers can access
- Schema.org structured data: tells AI engines exactly what each piece of content is, who wrote it, and what it is about
- Canonical URL resolution: prevents AI engines from treating duplicate pages as separate entities
- Author attribution on content: a key trust signal for citation
- AEO-optimised content structure: articles that front-load direct answers so AI engines can extract and cite them accurately
llms.txt improves the efficiency of AI comprehension. The rest of the stack determines what there is to comprehend. Both matter. You can read more about how these layers work together on the Windgrove blog.
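Because robots.txt decides whether a crawler may fetch anything at all, it is worth checking that it does not block the AI crawlers you want reading your llms.txt. The Python sketch below uses the standard library's `urllib.robotparser`. The user agent names and the sample robots.txt rules are examples, so confirm the current names in each vendor's documentation.

```python
from urllib.robotparser import RobotFileParser

# Example AI crawler user agents (verify against vendor documentation).
AI_USER_AGENTS = ("GPTBot", "OAI-SearchBot", "PerplexityBot")


def ai_crawler_access(robots_txt, url, user_agents=AI_USER_AGENTS):
    """Return {user_agent: allowed} for a URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


# Sample rules: GPTBot is blocked everywhere, everyone else only from /admin/.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

access = ai_crawler_access(ROBOTS, "https://acme.com/llms.txt")
print(access)
# {'GPTBot': False, 'OAI-SearchBot': True, 'PerplexityBot': True}
```

In this sample, a well-written llms.txt would never reach GPTBot, which is the kind of conflict between layers that an audit should catch.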
Frequently Asked Questions
Does llms.txt affect my Google search rankings?
No. Google has confirmed that llms.txt does not influence traditional blue-link search rankings. Google's John Mueller stated in May 2026 that llms.txt is "not done for search" and is instead about functionality — helping AI agents use your site more effectively once they arrive. The two systems are separate. llms.txt is an AI readability signal, not an SEO ranking factor.
How long does it take to create an llms.txt file?
For a straightforward business website, creating a basic llms.txt file takes between 30 minutes and two hours. The file itself is plain text written in Markdown — no coding required. The more time-consuming part is deciding which pages to include and writing a precise one-sentence description of your organisation. A vague description produces vague AI citations, so that line is worth getting right.
Will AI engines like ChatGPT and Perplexity automatically read my llms.txt file?
llms.txt is a guide, not a command. AI agents and crawlers can choose to follow it or ignore it. That said, crawlers from major AI platforms, including OpenAI's OAI-SearchBot and Perplexity's crawler, are increasingly designed to look for and use llms.txt files when present. Providing the file removes friction and improves the accuracy of how AI systems describe and cite your business.
Is llms.txt enough to make my site visible in AI search results?
No. llms.txt improves AI comprehension efficiency, but it is one layer of a broader technical and content stack. A complete AI-ready site also needs a properly configured XML sitemap, clean robots.txt, Schema.org structured data, resolved canonical URLs, author attribution on content, and AEO-optimised articles that front-load direct answers. llms.txt without the rest of the stack is like having a clean front door on a building with no address.
How do I know if my site has the right AI visibility infrastructure in place?
The fastest way is a structured audit. Windgrove offers a free AI visibility audit that covers technical infrastructure gaps, including llms.txt presence, schema implementation, canonical issues, and content structure. You get a clear picture of exactly what is preventing AI engines from reading and citing your site. If you want to understand the full scope of what is possible, you can also book a free consultation with the Windgrove team.