Agent-Ready: Why the Same Site Scores 23 or 91

Two popular agent-readiness scanners will score the same company 68 points apart. Fern gives Mintlify's docs a 91. Cloudflare's scanner gives mintlify.com a 23. Same company. Two tools. Two verdicts. It is happening everywhere - across the fourteen site scans run for this piece, across documentation subdomains and marketing roots, across companies whose whole product is making sites legible to AI agents.
The pattern is not a bug. Different scanners measure different things, and nobody writing "is your site agent-ready?" tells you that up front. This piece is the map.
Key Takeaways
- "Agent-ready" is not one thing: two popular scanners give the same company a 68-point gap (Mintlify: Fern 91/A, Cloudflare 23/L1).
- Cloudflare's ladder runs from L1 through L5 "Agent-Native" - two of those level names (L4 "Agent-Integrated", L5 "Agent-Native") are not in Cloudflare's public blog.
- OpenAI's developer docs score an F (59/100) on Fern's scanner. The largest LLM vendor is not docs-agent-ready by the measure built for that question.
- Three dimensions fragment "agent-ready": reach (protocols), read (content), trust (identity and intent). Twelve site scanners exist today; none cover all three.
- Before asking "is my site agent-ready?" ask "for which agents, and in what sense?" Run your site through both scanners, then scan your docs subdomain too - expect different answers.
Why do two agent-readiness scanners give the same company a 68-point gap?
They measure different things. Cloudflare's scanner rewards protocol adoption - MCP Server Card, OAuth discovery, Content Signals. Fern's rewards content accessibility - markdown negotiation, page size, rendering strategy. Mintlify passes one and skips the other. Same company, two scanners, 68 points apart.
Mintlify sells documentation software built for AI agents. Their docs pass Fern's Agent Score at 91/A. Their own marketing site scores 23 on isitagentready.com - Level 1, "Basic Web Presence." A tool-builder failing a different tool-builder's scanner. One company, two verdicts, 68-point gap.
On April 18, Dachary Carey - author of the Agent-Friendly Documentation Spec that powers Fern's scanner - named the pattern. "They're not measuring the same thing," she wrote, "and the gaps can be dramatic." She gave one example: her own site, 100/100 on Fern, 33 on Cloudflare. She walked through which Fern checks passed, which Cloudflare checks failed, and why - a side-by-side of the same URL through two definitions of "ready."
Her map uses three dimensions - content accessibility, protocol adoption, agent experience. This piece frames it slightly differently: reach, read, trust. Her "content accessibility" is our "read." Her "protocol adoption" is our "reach." Her third dimension - "agent experience" - asks whether the agent succeeds at its task, which is a measurement no site scanner can run. This piece stays on what scanners measure: the publish side. Trust, in our frame, is named by a different set of vendors entirely, and no mass-market scanner measures it yet.
The pattern holds across the rest of the fourteen site scans run for this piece. Most score well on one scanner and poorly on the other. A handful score well on both. Almost none are past Level 3 on Cloudflare's ladder unless their documentation lives on a separate subdomain that was built with agents in mind. That last pattern matters: documentation hostnames - docs.X.com, developers.X.com, platform.X.com - are usually built by teams that think about agent consumers. Marketing roots are usually not. Same company, different surface, different audience.
What is each scanner actually measuring?
Cloudflare's scanner probes the protocol stack - .well-known endpoints, MCP cards, OAuth discovery, Content Signals. Fern's probes the content layer - llms.txt, markdown negotiation, rendering strategy, page size. Both call themselves "agent-ready." Neither names the split. The matrix below shows what that disagreement looks like across ten of the fourteen sites scanned for this piece.
| Company | Scanned URL | CF Score | CF Level | Fern Score | Fern Grade |
|---|---|---|---|---|---|
| Cloudflare | cloudflare.com | 31 | L1 Basic | not in directory | - |
| Cloudflare | docs.cloudflare.com | 53 | L4 Agent-Integrated | 97 | A |
| Mintlify | mintlify.com | 23 | L1 Basic | 91 | A |
| Postman | postman.com | 23 | L1 Basic | 96 | A |
| OpenAI | platform.openai.com | 23 | L1 Basic | 59 | F |
| Stripe | docs.stripe.com | 38 | L1 Basic | 88 | B |
| Cursor | cursor.com | 23 | L1 Basic | 75 | C |
| Resend | resend.com | 69 | L4 Agent-Integrated | 99 | A |
| Fern | buildwithfern.com | 23 | L1 Basic | 86 | B |
| CompetLab | competlab.com | 31 | L2 Bot-Aware | not in directory | - |
What each scanner says it measures, in its own words. Cloudflare's launch blog describes isitagentready.com as "a new tool to help site owners understand how they can make their sites optimized for agents, from guiding agents on how to authenticate, to controlling what content agents can see, [and] the format they receive it in." The scan covers Discoverability, Content Accessibility, Bot Access Control, and Protocol Discovery. Fern's afdocs documentation frames Agent Score differently: it "measures how well AI coding agents can discover, navigate, and consume your docs." Different question, different answer.
Four patterns jump out of the matrix. Docs subdomains score higher than marketing roots. Companies whose business is documentation (Mintlify, Postman) score well on Fern and poorly on Cloudflare. OpenAI's developer docs underperform relative to the company's size. Two sites - Resend and docs.cloudflare.com - are Level 4 on Cloudflare, a tier Cloudflare does not publicize.
The first three patterns follow from the same cause: scanner choice matches company shape. Docs-heavy companies score well on the scanner that measures docs. Marketing-site-only companies score poorly on both scanners because they ship neither stack. The OpenAI outlier is the interesting one - a company at the center of the agent ecosystem whose developer surface ranks worst in this sample on the scanner built to measure coding-agent fitness.
Mintlify: docs-native, protocol-blind. Mintlify sells documentation software that turns marketing prose into markdown-negotiated, SSR'd, llms.txt-blessed pages. Their product exists to make content legible to coding agents. That product works: Fern gives their docs a 91. But mintlify.com itself - the marketing site - has no Content Signals in robots.txt, no API catalog at .well-known, no MCP Server Card, no Agent Skills index. Cloudflare's scanner reads the silence and returns 23. The tool-builder has not shipped the tools on themselves. Read-heavy, reach-poor.
OpenAI: an F on docs-agent-readiness. platform.openai.com - the developer documentation for the company whose models consume most "agent-ready" content - scores 59 on Fern's scanner. Grade F. Not a gotcha. OpenAI's platform docs are a single-page application, which fails Fern's rendering-strategy check; pages run long; content-start-position flags fire. The scanner built to measure docs-for-coding-agents says the largest LLM vendor's docs are not that. What a scanner measures depends on what "agent" means to it.
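To make the rendering-strategy failure concrete, here is a crude way to check whether a page ships readable text in its raw HTML or arrives as a JavaScript shell. This is a sketch of the idea, not Fern's actual check - the URL, threshold, and heuristic are all illustrative assumptions:

```python
import re
import requests

def looks_server_rendered(url: str, min_text_chars: int = 500) -> bool:
    """Crude SSR heuristic (illustrative, not Fern's actual check):
    strip scripts, styles, and tags from the raw HTML and count what
    is left. A JS-only SPA typically ships a near-empty shell, so
    little visible text survives before any JavaScript runs."""
    html = requests.get(url, timeout=10).text
    html = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", " ", html)
    text = re.sub(r"<[^>]+>", " ", html)      # drop remaining tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return len(text) >= min_text_chars

# Hypothetical URL: an SSR'd docs page passes, a bare SPA shell fails.
print(looks_server_rendered("https://docs.example.com/api"))
```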
Resend: the rare both-sides ship. Resend scored 69 on Cloudflare. That puts them at Level 4, "Agent-Integrated" - a level name that does not appear in Cloudflare's public blog post. Fern gives them a 99. Of 127 companies in Fern's public directory, only a handful score well on both scanners. Resend shipped markdown negotiation, Content Signals, an Agent Skills index, and an MCP Server Card. They did the uncommon thing: implemented both sides. The gap between scanners is not "scanners are broken." It is "most companies have not chosen which side to ship."
Cloudflare's ladder continues past Level 4 to Level 5, "Agent-Native," gated on four capability checks - MCP Server Card, OAuth Protected Resource, A2A Agent Card, API Catalog. Nobody in the matrix has passed all four.
The three dimensions of "agent-ready" - reach, read, and trust
Agent-readiness fragments into three things: whether an agent can reach your endpoints, read your content, and whether the site knows it can trust the agent hitting it. Most scanners measure one dimension. A few measure two. None cover all three. The gap is the map.
Reach
By "reach" we mean protocol-level discoverability - can an agent find and call your endpoints? Not "bytes delivered." Cloudflare's scanner weights this dimension most heavily. The checks are infrastructure: a valid robots.txt, a sitemap, Link headers, AI-bot rules in robots.txt, Content Signals declarations, Web Bot Auth, OAuth discovery at .well-known, an MCP Server Card at /.well-known/mcp/server-card.json, an Agent Skills index, an A2A Agent Card, an API Catalog per RFC 9727. An agent that speaks one of those protocols gets a structured handshake with your site. An agent that does not, sees nothing.
Who needs reach? Any site whose buyer runs agent-to-agent integrations. An MCP client consuming third-party tools. A SaaS connecting to partner APIs through OAuth-discovery flows. An automation platform that wants to register itself with a customer's agent workspace. Reach is for protocols, and protocols are what machines negotiate before humans get involved.
Read
By "read" we mean whether an agent can make sense of your content - not whether an agent can fetch the bytes. Fetching is a prerequisite. Making sense is the point. Fern's scanner weights this. The checks look at the content layer: an llms.txt file that fits in one agent fetch, markdown served when an agent requests Accept: text/markdown, pages under 50K characters, server-side rendering rather than a JavaScript-only SPA, stable URLs, content starting near the top of each page, authentication walls with public alternatives. Cloudflare's scanner includes the markdown-negotiation check too. Fern's scanner skips almost everything else Cloudflare measures. The overlap is narrow.
Who needs read? Coding agents inside IDEs. Claude Code fetching your API documentation mid-task. Cursor pulling your SDK reference. An engineer asking a copilot to generate a client against your REST spec. Read is about content that survives being parsed, summarized, and code-generated from - without the agent getting lost in a JavaScript-only shell or truncating at page three of a 400-KB blob.
Trust
By "trust" we mean whether the site knows who the agent is and whether it allows the agent to act. Reach answers "can the agent find me?" Read answers "can the agent make sense of what I serve?" Trust answers "should this agent be here at all?"
A separate set of vendors measures this dimension under different vocabulary. DataDome, after its 2026 rebrand, calls itself "your traffic control plane for humans, bots, and AI agents." HUMAN Security sells AgenticTrust as "a trust and governance layer for agentic AI" that "detect[s] and classif[ies] AI agents, verif[ies] trust level, and govern[s] how agents interact with web and mobile applications." Akamai's 2026 forecast, via Reuben Koh, frames the business case: "blocking all AI bot/agent traffic will become a competitive disadvantage."
None of these vendors call what they measure "agent-readiness." They call it agent trust, traffic management, bot authentication. But they are measuring a dimension Cloudflare's scanner does not touch and Fern's scanner cannot - whether the agent hitting your site is allowed in, and whether it has identified itself.
Trust is not a column in the matrix because no mass-market scanner measures it across sites yet. It is a dimension, not yet a score.
This piece is about how sites are measured for agent-readiness - the publish side. Whether agents actually cite you, rank you, or pick your brand when generating answers is a separate question. That market (Profound, Peec, Evertune) measures different numbers and answers a different buyer. Different post.
Three scanner types, three dimensions, one company measured three ways. A score on any one dimension is not an error. It is a narrow answer to a narrow question.
What to do if you care about this
Run both scanners on your site. Expect different numbers. Pick the dimension that matches the agents your buyer cares about: reach for agent-to-agent protocol flows, read for coding agents consuming your docs, trust for any public-internet traffic you cannot identify. Then scan your docs subdomain separately. It is usually a different story.
Twelve site-readiness scanners exist today - Cloudflare and Fern plus ten others:
- isagentready.com (an independent five-category scanner, not Cloudflare's)
- SiteSpeakAI (WebMCP + llms.txt + structured data)
- Agentiview (three sub-scores against a 30,000-company index)
- Trendos (ecommerce-specific)
- GEOAudit (Chrome extension)
- LLMs.txt Checker (Apify-hosted)
- StartDesigns
- Hunted.space
- AI Site Scorer (MCP-embedded)
- AEO Scanner (Cursor plugin)

The trust-axis vendors - DataDome, HUMAN, Akamai - measure the trust dimension most scanners do not touch. Factory.ai's Agent Readiness exists but scores codebases, not sites. No single product covers every dimension.
The fragmentation is not a failure. It is what happens early in a category: every vendor picks a slice, names it, ships a tool. The slices disagree because the category has not hardened yet. Pick the scanner whose question matches yours, and expect the answer to mean something narrower than the marketing implies.
One more split worth naming. Cloudflare's scanner scored cloudflare.com at 31 and docs.cloudflare.com at 53. Stripe's root scored 18 and docs.stripe.com got 38. The split compounds at the URL level - the same company is not the same surface. That is the next piece.
Methodology. Cloudflare scans run via isitagentready.com/api/scan on April 21, 2026. Fern scores pulled from buildwithfern.com/agent-score the same day. Fourteen total site scans; ten rows selected for the published table based on story diversity. Scores reproducible by running the same URLs through each scanner. Scan JSONs archived.
There is no unified view. Not yet.
Frequently Asked Questions
What does "agent-ready" actually mean?
It depends on which scanner you ask. Cloudflare's scanner defines it as protocol adoption: can an agent discover your endpoints via /.well-known/ paths, parse your Content Signals, authenticate via OAuth. Fern defines it as content accessibility: can a coding agent fetch markdown, handle page size, navigate without a single-page application in the way. A third group of vendors (DataDome, HUMAN Security, Akamai) measures a trust dimension most scanners do not touch - whether the site knows who the agent is. There is no single definition and no unified score today.
Why do the same company's scores differ between scanners?
Because each scanner probes a different layer of the same site. Cloudflare's scanner runs 17 protocol-level checks mostly at /.well-known/ endpoints. Fern's scanner runs 22 content-quality checks on documentation pages. Mintlify is the clearest case: 91/A on Fern because their docs are markdown-negotiable and well-structured, 23/L1 on Cloudflare because their marketing site has no Content Signals, no MCP Server Card, no Agent Skills index. One company, two scanners, two verdicts - each faithful to what it measures.
What are the five Cloudflare Agent Readiness levels?
The levels are gate-based, not score-based. L1 "Basic Web Presence" is the starting tier. L2 "Bot-Aware" unlocks when your robots.txt declares Content Signals. L3 "Agent-Readable" unlocks when your server negotiates markdown content. L4 "Agent-Integrated" requires passing Agent Skills index and link-header checks. L5 "Agent-Native" requires four capability checks: MCP Server Card, OAuth Protected Resource, A2A Agent Card, API Catalog. Passing a higher level's gate before a lower one does not help - the ladder enforces order.
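The ordering rule is easier to see as logic than as prose. A minimal sketch - the gate names are invented labels for the checks described above, not Cloudflare's actual identifiers:

```python
# Gate names are invented labels for the checks described above,
# not Cloudflare's actual identifiers.
LADDER = [
    ("L2 Bot-Aware",        {"content_signals"}),
    ("L3 Agent-Readable",   {"markdown_negotiation"}),
    ("L4 Agent-Integrated", {"agent_skills_index", "link_headers"}),
    ("L5 Agent-Native",     {"mcp_server_card", "oauth_protected_resource",
                             "a2a_agent_card", "api_catalog"}),
]

def level(passed: set[str]) -> str:
    current = "L1 Basic Web Presence"
    for name, gates in LADDER:
        if not gates <= passed:
            break  # one failed gate stops the climb; higher passes do not count
        current = name
    return current

# An MCP Server Card alone cannot skip the L2 gate:
print(level({"mcp_server_card"}))                          # L1 Basic Web Presence
print(level({"content_signals", "markdown_negotiation"}))  # L3 Agent-Readable
```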
Which scanner should I run on my site?
Run both. Expect different numbers. Use Cloudflare's scanner if agents will talk to your endpoints via MCP or OAuth - protocol-shaped integrations. Use Fern's if coding agents (Claude Code, Cursor, GitHub Copilot) will fetch your docs during development. If your concern is bot traffic you cannot identify, a trust-layer vendor (DataDome, HUMAN, Akamai) measures something neither Cloudflare nor Fern touches. Scan your root domain and your docs subdomain separately - the gap between them is instructive.
Is llms.txt worth shipping if major LLMs do not fetch it?
The defensive answer is yes: it costs two minutes, breaks nothing, and puts you on par with sites that competing agents may reference in a year. The aggressive answer is probably not yet: John Mueller (Google), the ALLMO study of 94,000 cited URLs, and Flavio Longato's server-log audit all report that major LLM vendors rarely fetch it. Do not pay a vendor to implement it. Do not expect citation uplift. A separate post goes deeper on the consumption gap.
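For the two-minute version, this is roughly what a minimal llms.txt looks like under the llmstxt.org convention - an H1 name, a one-line blockquote summary, and sections of annotated links. Every name and URL below is a placeholder:

```markdown
# Example Corp

> Example Corp is an email API. These docs cover the REST API, SDKs, and webhooks.

## Docs

- [Quickstart](https://docs.example.com/quickstart.md): first request in five minutes
- [API reference](https://docs.example.com/api.md): endpoints, auth, rate limits

## Optional

- [Changelog](https://docs.example.com/changelog.md)
```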