Agent-Ready: Why the Same Site Scores 23 or 91

Two popular agent-readiness scanners will score the same company 68 points apart. Fern gives Mintlify's docs a 91. Cloudflare's scanner gives mintlify.com a 23. Same company. Two tools. Two verdicts. It is happening everywhere - across the fourteen site scans run for this piece, across documentation subdomains and marketing roots, across companies whose whole product is making sites legible to AI agents.
The pattern is not a bug. Different scanners measure different things, and nobody writing "is your site agent-ready?" tells you that up front. This piece is the map.
Key Takeaways
- "Agent-ready" is not one thing: two popular scanners give the same company a 68-point gap (Mintlify: Fern 91/A, Cloudflare 23/L1).
- Cloudflare's ladder runs from L1 through L5 "Agent-Native" - two of those level names (L4 "Agent-Integrated", L5 "Agent-Native") are not in Cloudflare's public blog.
- OpenAI's developer docs score an F (59/100) on Fern's scanner. The largest LLM vendor is not docs-agent-ready by the measure built for that question.
- Three dimensions fragment "agent-ready": reach (protocols), read (content), trust (identity and intent). Twelve site scanners exist today; none cover all three.
- Before asking "is my site agent-ready?" ask "for which agents, and in what sense?" Run your site through both scanners, then scan your docs subdomain too - expect different answers.
Why do two agent-readiness scanners give the same company a 68-point gap?
They measure different things. Cloudflare's scanner rewards protocol adoption - MCP Server Card, OAuth discovery, Content Signals. Fern's rewards content accessibility - markdown negotiation, page size, rendering strategy. Mintlify passes one and skips the other. Same company, two scanners, 68 points apart.
Mintlify sells documentation software built for AI agents. Their docs pass Fern's Agent Score at 91/A. Their own marketing site scores 23 on isitagentready.com - Level 1, "Basic Web Presence." A tool-builder failing a different tool-builder's scanner. One company, two verdicts, 68-point gap.
On April 18, Dachary Carey - author of the Agent-Friendly Documentation Spec that powers Fern's scanner - named the pattern. "They're not measuring the same thing," she wrote, "and the gaps can be dramatic." She gave one example: her own site, 100/100 on Fern, 33 on Cloudflare. She walked through which Fern checks passed, which Cloudflare checks failed, and why - a side-by-side of the same URL through two definitions of "ready."
Her map uses three dimensions - content accessibility, protocol adoption, agent experience. This piece frames it slightly differently: reach, read, trust. Her "content accessibility" is our "read." Her "protocol adoption" is our "reach." Her third dimension - "agent experience" - asks whether the agent succeeds at its task, which is a measurement no site scanner can run. This piece stays on what scanners measure: the publish side. Trust, in our frame, is named by a different set of vendors entirely, and no mass-market scanner measures it yet.
The pattern holds across the rest of the fourteen site scans run for this piece. Most score well on one scanner and poorly on the other. A handful score well on both. Almost none are past Level 3 on Cloudflare's ladder unless their documentation lives on a separate subdomain that was built with agents in mind. That last pattern matters: documentation hostnames - docs.X.com, developers.X.com, platform.X.com - are usually built by teams that think about agent consumers. Marketing roots are usually not. Same company, different surface, different audience.
What is each scanner actually measuring?
Cloudflare's scanner probes the protocol stack - .well-known endpoints, MCP cards, OAuth discovery, Content Signals. Fern's probes the content layer - llms.txt, markdown negotiation, rendering strategy, page size. Both call themselves "agent-ready." Neither names the split. The matrix below shows what that disagreement looks like across ten of the fourteen sites scanned for this piece.
| Company | Scanned URL | CF Score | CF Level | Fern Score | Fern Grade |
|---|---|---|---|---|---|
| Cloudflare | cloudflare.com | 31 | L1 Basic | not in directory | - |
| Cloudflare | docs.cloudflare.com | 53 | L4 Agent-Integrated | 97 | A |
| Mintlify | mintlify.com | 23 | L1 Basic | 91 | A |
| Postman | postman.com | 23 | L1 Basic | 96 | A |
| OpenAI | platform.openai.com | 23 | L1 Basic | 59 | F |
| Stripe | docs.stripe.com | 38 | L1 Basic | 88 | B |
| Cursor | cursor.com | 23 | L1 Basic | 75 | C |
| Resend | resend.com | 69 | L4 Agent-Integrated | 99 | A |
| Fern | buildwithfern.com | 23 | L1 Basic | 86 | B |
| CompetLab | competlab.com | 31 | L2 Bot-Aware | not in directory | - |
What each scanner says it measures, in its own words. Cloudflare's launch blog describes isitagentready.com as "a new tool to help site owners understand how they can make their sites optimized for agents, from guiding agents on how to authenticate, to controlling what content agents can see, [and] the format they receive it in." The scan covers Discoverability, Content Accessibility, Bot Access Control, and Protocol Discovery. Fern's afdocs documentation frames Agent Score differently: it "measures how well AI coding agents can discover, navigate, and consume your docs." Different question, different answer.
Four patterns jump out of the matrix. Docs subdomains score higher than marketing roots. Companies whose business is documentation (Mintlify, Postman) score well on Fern and poorly on Cloudflare. OpenAI's developer docs underperform relative to the company's size. Two sites - Resend and docs.cloudflare.com - are Level 4 on Cloudflare, a tier Cloudflare does not publicize.
The first three patterns follow from the same cause: scanner choice matches company shape. Docs-heavy companies score well on the scanner that measures docs. Marketing-site-only companies score poorly on both scanners because they ship neither stack. The OpenAI outlier is the interesting one - a company at the center of the agent ecosystem whose developer surface ranks worst in this sample on the scanner built to measure coding-agent fitness.
Mintlify: docs-native, protocol-blind. Mintlify sells documentation software that turns marketing prose into markdown-negotiated, SSR'd, llms.txt-blessed pages. Their product exists to make content legible to coding agents. That product works: Fern gives their docs a 91. But mintlify.com itself - the marketing site - has no Content Signals in robots.txt, no API catalog at .well-known, no MCP Server Card, no Agent Skills index. Cloudflare's scanner reads the silence and returns 23. The tool-builder has not shipped the tools on themselves. Read-heavy, reach-poor.
OpenAI: an F on docs-agent-readiness. platform.openai.com - the developer documentation for the company whose models consume most "agent-ready" content - scores 59 on Fern's scanner. Grade F. Not a gotcha. OpenAI's platform docs are a single-page application, which fails Fern's rendering-strategy check; pages run long; content-start-position flags fire. The scanner built to measure docs-for-coding-agents says the largest LLM vendor's docs are not that. What a scanner measures depends on what "agent" means to it.
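To make the rendering-strategy failure concrete, here is a crude way to check whether a page ships readable text in its raw HTML or arrives as a JavaScript shell. This is a sketch of the idea, not Fern's actual check - the URL, threshold, and heuristic are all illustrative assumptions:

```python
import re
import requests

def looks_server_rendered(url: str, min_text_chars: int = 500) -> bool:
    """Crude SSR heuristic (illustrative, not Fern's actual check):
    strip scripts, styles, and tags from the raw HTML and count what
    is left. A JS-only SPA typically ships a near-empty shell, so
    little visible text survives before any JavaScript runs."""
    html = requests.get(url, timeout=10).text
    html = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", " ", html)
    text = re.sub(r"<[^>]+>", " ", html)      # drop remaining tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return len(text) >= min_text_chars

# Hypothetical URL: an SSR'd docs page passes, a bare SPA shell fails.
print(looks_server_rendered("https://docs.example.com/api"))
```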
Resend: the rare both-sides ship. Resend scored 69 on Cloudflare. That puts them at Level 4, "Agent-Integrated" - a level name that does not appear in Cloudflare's public blog post. Fern gives them a 99. Of 127 companies in Fern's public directory, only a handful score well on both scanners. Resend shipped markdown negotiation, Content Signals, an Agent Skills index, and an MCP Server Card. They did the uncommon thing: implemented both sides. The gap between scanners is not "scanners are broken." It is "most companies have not chosen which side to ship."
Cloudflare's ladder continues past Level 4 to Level 5, "Agent-Native," gated on four capability checks - MCP Server Card, OAuth Protected Resource, A2A Agent Card, API Catalog. Nobody in the matrix has passed all four.
The three dimensions of "agent-ready" - reach, read, and trust
Agent-readiness fragments into three things: whether an agent can reach your endpoints, read your content, and whether the site knows it can trust the agent hitting it. Most scanners measure one dimension. A few measure two. None cover all three. The gap is the map.
Reach
By "reach" we mean protocol-level discoverability - can an agent find and call your endpoints? Not "bytes delivered." Cloudflare's scanner weights this dimension most heavily. The checks are infrastructure: a valid robots.txt, a sitemap, Link headers, AI-bot rules in robots.txt, Content Signals declarations, Web Bot Auth, OAuth discovery at .well-known, an MCP Server Card at /.well-known/mcp/server-card.json, an Agent Skills index, an A2A Agent Card, an API Catalog per RFC 9727. An agent that speaks one of those protocols gets a structured handshake with your site. An agent that does not, sees nothing.
Who needs reach? Any site whose buyer runs agent-to-agent integrations. An MCP client consuming third-party tools. A SaaS connecting to partner APIs through OAuth-discovery flows. An automation platform that wants to register itself with a customer's agent workspace. Reach is for protocols, and protocols are what machines negotiate before humans get involved.
Read
By "read" we mean whether an agent can make sense of your content - not whether an agent can fetch the bytes. Fetching is a prerequisite. Making sense is the point. Fern's scanner weights this. The checks look at the content layer: an llms.txt file that fits in one agent fetch, markdown served when an agent requests Accept: text/markdown, pages under 50K characters, server-side rendering rather than a JavaScript-only SPA, stable URLs, content starting near the top of each page, authentication walls with public alternatives. Cloudflare's scanner includes the markdown-negotiation check too. Fern's scanner skips almost everything else Cloudflare measures. The overlap is narrow.
Who needs read? Coding agents inside IDEs. Claude Code fetching your API documentation mid-task. Cursor pulling your SDK reference. An engineer asking a copilot to generate a client against your REST spec. Read is about content that survives being parsed, summarized, and code-generated from - without the agent getting lost in a JavaScript-only shell or truncating at page three of a 400-KB blob.
Trust
By "trust" we mean whether the site knows who the agent is and whether it allows the agent to act. Reach answers "can the agent find me?" Read answers "can the agent make sense of what I serve?" Trust answers "should this agent be here at all?"
A separate set of vendors measures this dimension under different vocabulary. DataDome, after its 2026 rebrand, calls itself "your traffic control plane for humans, bots, and AI agents." HUMAN Security sells AgenticTrust as "a trust and governance layer for agentic AI" that "detect[s] and classif[ies] AI agents, verif[ies] trust level, and govern[s] how agents interact with web and mobile applications." Akamai's 2026 forecast, via Reuben Koh, frames the business case: "blocking all AI bot/agent traffic will become a competitive disadvantage."
None of these vendors call what they measure "agent-readiness." They call it agent trust, traffic management, bot authentication. But they are measuring a dimension Cloudflare's scanner does not touch and Fern's scanner cannot - whether the agent hitting your site is allowed in, and whether it has identified itself.
Trust is not a column in the matrix because no mass-market scanner measures it across sites yet. It is a dimension, not yet a score.
This piece is about how sites are measured for agent-readiness - the publish side. Whether agents actually cite you, rank you, or pick your brand when generating answers is a separate question. That market (Profound, Peec, Evertune) measures different numbers and answers a different buyer. Different post.
Three scanner types, three dimensions, one company measured three ways. A score on any one dimension is not an error. It is a narrow answer to a narrow question.
What to do if you care about this
Run both scanners on your site. Expect different numbers. Pick the dimension that matches the agents your buyer cares about: reach for agent-to-agent protocol flows, read for coding agents consuming your docs, trust for any public-internet traffic you cannot identify. Then scan your docs subdomain separately. It is usually a different story.
Twelve site-readiness scanners exist today - Cloudflare and Fern plus ten others:
- isagentready.com (an independent five-category scanner, not Cloudflare's)
- SiteSpeakAI (WebMCP + llms.txt + structured data)
- Agentiview (three sub-scores against a 30,000-company index)
- Trendos (ecommerce-specific)
- GEOAudit (Chrome extension)
- LLMs.txt Checker (Apify-hosted)
- StartDesigns
- Hunted.space
- AI Site Scorer (MCP-embedded)
- AEO Scanner (Cursor plugin)

The trust-axis vendors - DataDome, HUMAN, Akamai - measure the trust dimension most scanners do not touch. Factory.ai's Agent Readiness exists but scores codebases, not sites. No single product covers every dimension.
The fragmentation is not a failure. It is what happens early in a category: every vendor picks a slice, names it, ships a tool. The slices disagree because the category has not hardened yet. Pick the scanner whose question matches yours, and expect the answer to mean something narrower than the marketing implies.
One more split worth naming. Cloudflare's scanner scored cloudflare.com at 31 and docs.cloudflare.com at 53. Stripe's root scored 18 and docs.stripe.com got 38. The split compounds at the URL level - the same company is not the same surface. That is the next piece.
Methodology. Cloudflare scans run via isitagentready.com/api/scan on April 21, 2026. Fern scores pulled from buildwithfern.com/agent-score the same day. Fourteen total site scans; ten rows selected for the published table based on story diversity. Scores reproducible by running the same URLs through each scanner. Scan JSONs archived.
There is no unified view. Not yet.
Frequently Asked Questions
What does "agent-ready" actually mean?
It depends on which scanner you ask. Cloudflare's scanner defines it as protocol adoption: can an agent discover your endpoints via /.well-known/ paths, parse your Content Signals, authenticate via OAuth. Fern defines it as content accessibility: can a coding agent fetch markdown, handle page size, navigate without a single-page application in the way. A third group of vendors (DataDome, HUMAN Security, Akamai) measures a trust dimension most scanners do not touch - whether the site knows who the agent is. There is no single definition and no unified score today.
Why do the same company's scores differ between scanners?
Because each scanner probes a different layer of the same site. Cloudflare's scanner runs 17 protocol-level checks mostly at /.well-known/ endpoints. Fern's scanner runs 22 content-quality checks on documentation pages. Mintlify is the clearest case: 91/A on Fern because their docs are markdown-negotiable and well-structured, 23/L1 on Cloudflare because their marketing site has no Content Signals, no MCP Server Card, no Agent Skills index. One company, two scanners, two verdicts - each faithful to what it measures.
What are the five Cloudflare Agent Readiness levels?
The levels are gate-based, not score-based. L1 "Basic Web Presence" is the starting tier. L2 "Bot-Aware" unlocks when your robots.txt declares Content Signals. L3 "Agent-Readable" unlocks when your server negotiates markdown content. L4 "Agent-Integrated" requires passing Agent Skills index and link-header checks. L5 "Agent-Native" requires four capability checks: MCP Server Card, OAuth Protected Resource, A2A Agent Card, API Catalog. Passing a higher level's gate before a lower one does not help - the ladder enforces order.
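The ordering rule is easier to see as logic than as prose. A minimal sketch - the gate names are invented labels for the checks described above, not Cloudflare's actual identifiers:

```python
# Gate names are invented labels for the checks described above,
# not Cloudflare's actual identifiers.
LADDER = [
    ("L2 Bot-Aware",        {"content_signals"}),
    ("L3 Agent-Readable",   {"markdown_negotiation"}),
    ("L4 Agent-Integrated", {"agent_skills_index", "link_headers"}),
    ("L5 Agent-Native",     {"mcp_server_card", "oauth_protected_resource",
                             "a2a_agent_card", "api_catalog"}),
]

def level(passed: set[str]) -> str:
    current = "L1 Basic Web Presence"
    for name, gates in LADDER:
        if not gates <= passed:
            break  # one failed gate stops the climb; higher passes do not count
        current = name
    return current

# An MCP Server Card alone cannot skip the L2 gate:
print(level({"mcp_server_card"}))                          # L1 Basic Web Presence
print(level({"content_signals", "markdown_negotiation"}))  # L3 Agent-Readable
```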
Which scanner should I run on my site?
Run both. Expect different numbers. Use Cloudflare's scanner if agents will talk to your endpoints via MCP or OAuth - protocol-shaped integrations. Use Fern's if coding agents (Claude Code, Cursor, GitHub Copilot) will fetch your docs during development. If your concern is bot traffic you cannot identify, a trust-layer vendor (DataDome, HUMAN, Akamai) measures something neither Cloudflare nor Fern touches. Scan your root domain and your docs subdomain separately - the gap between them is instructive.
Is llms.txt worth shipping if major LLMs do not fetch it?
The defensive answer is yes: it costs two minutes, breaks nothing, and puts you on par with sites that competing agents may reference in a year. The aggressive answer is probably not yet: John Mueller (Google), the ALLMO study of 94,000 cited URLs, and Flavio Longato's server-log audit all report that major LLM vendors rarely fetch it. Do not pay a vendor to implement it. Do not expect citation uplift. A separate post goes deeper on the consumption gap.
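For the two-minute version, this is roughly what a minimal llms.txt looks like under the llmstxt.org convention - an H1 name, a one-line blockquote summary, and sections of annotated links. Every name and URL below is a placeholder:

```markdown
# Example Corp

> Example Corp is an email API. These docs cover the REST API, SDKs, and webhooks.

## Docs

- [Quickstart](https://docs.example.com/quickstart.md): first request in five minutes
- [API reference](https://docs.example.com/api.md): endpoints, auth, rate limits

## Optional

- [Changelog](https://docs.example.com/changelog.md)
```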