See your website the way AI crawlers do
Most tools that grade a website fetch it once, as a browser, and score the HTML. But the systems that increasingly decide whether your brand shows up in an answer (ChatGPT, Claude, Perplexity, Google's AI surfaces) do not arrive as your browser. They arrive as GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, and Google-Extended, and they can get a very different response. @agentmarkup/audit shows you that response.
The blind spot
A page can look perfect in your browser and still be a poor citation target for AI. The homepage might be an empty JavaScript shell that only fills in after a framework hydrates, so a crawler that does not run JavaScript sees nothing. A CDN or WAF rule might treat a crawler user-agent differently than a browser. There might be no llms.txt, or a malformed one. The JSON-LD that powers rich results and AI summaries might be missing or broken. None of that is visible from a single browser fetch.
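To make the empty-shell case concrete, here is a minimal Python sketch of one way to estimate how much text a client that does not run JavaScript would find in the raw HTML. It is not the audit's implementation: the names VisibleTextParser, visible_text, and looks_like_empty_shell are hypothetical, and the 200-character threshold is an arbitrary assumption chosen for illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text found in raw HTML, ignoring script/style/template bodies."""

    # <noscript> is deliberately NOT skipped: a client without JavaScript sees it.
    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text content of the raw HTML, joined with single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_like_empty_shell(html: str, min_chars: int = 200) -> bool:
    # min_chars is an illustrative threshold (an assumption), not the audit's rule.
    return len(visible_text(html)) < min_chars
```

A page whose body is only a mount point such as an empty root div plus a script bundle yields almost no text here, while a server-rendered page yields its actual copy.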
What the audit does
It fetches your URL once as a normal browser to establish a baseline, then again as each major AI crawler, and diffs the responses. On top of that it checks the machine-readable surface: robots.txt intent, Content-Signal, llms.txt, JSON-LD, and whether the raw HTML is actually readable without JavaScript.
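The baseline-then-diff idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how @agentmarkup/audit is written: the Response type, fetch, compare_to_baseline, and the 0.9 similarity threshold are all hypothetical, and the user-agent values are simplified tokens rather than the full strings the real crawlers publish.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from difflib import SequenceMatcher

# Simplified tokens for illustration only (assumption); real crawlers send
# longer published user-agent strings.
CRAWLER_USER_AGENTS = {
    "GPTBot": "GPTBot",
    "ClaudeBot": "ClaudeBot",
    "PerplexityBot": "PerplexityBot",
}


@dataclass
class Response:
    status: int
    body: str


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> Response:
    """Fetch a URL with the given User-Agent, keeping 4xx/5xx replies as data."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as reply:
            return Response(reply.status, reply.read().decode("utf-8", errors="replace"))
    except urllib.error.HTTPError as error:
        return Response(error.code, error.read().decode("utf-8", errors="replace"))


def compare_to_baseline(
    baseline: Response, crawler: Response, min_similarity: float = 0.9
) -> list[str]:
    """Return human-readable notes on how the crawler reply differs."""
    notes = []
    if crawler.status != baseline.status:
        notes.append(
            f"status differs: browser {baseline.status}, crawler {crawler.status}"
        )
    # autojunk=False keeps the ratio stable for long, repetitive HTML.
    matcher = SequenceMatcher(None, baseline.body, crawler.body, autojunk=False)
    ratio = matcher.ratio()
    if ratio < min_similarity:
        notes.append(f"body differs: similarity {ratio:.2f}")
    return notes
```

In use, one would call fetch once with a browser user-agent, once per entry in CRAWLER_USER_AGENTS, and pass each pair to compare_to_baseline. SequenceMatcher is quadratic in the worst case, so a real tool would likely compare extracted text or hashes rather than whole documents.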
# Audit any live URL as the major AI crawlers
npx @agentmarkup/audit https://example.com
# JSON for CI or comparisons
npx @agentmarkup/audit https://example.com --json
A run reads like this:
✓ OpenAI gptbot can reach the page
✓ Anthropic claudebot can reach the page
✓ Content is present without JavaScript
⚠ llms.txt is missing
✓ robots.txt does not block the expected AI crawlers
✓ JSON-LD structured data present
9/10 checks passed
Honest by design
Here is the part that makes the audit trustworthy rather than alarmist. It spoofs a crawler's user-agent from an ordinary IP, which is not what the real, verified bot does. So a 403 for a spoofed GPTBot user-agent is genuinely ambiguous: it could be a user-agent WAF rule that also blocks the real GPTBot, or it could be IP allowlisting where the verified GPTBot is let through just fine. The audit cannot tell those apart from a spoofed request, so it reports them as warnings with both explanations and the raw evidence, never as a bare "your site blocks AI" error.
Error-level findings, the ones that fail CI, are reserved for things provable from the response itself: a robots.txt that literally disallows the crawler, an empty JavaScript shell, or invalid llms.txt / JSON-LD. That is why the exit code is safe to gate a build on.
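The split between provable errors and ambiguous warnings can be expressed as a small decision function. The Python sketch below is an assumption about how such a rule might look, not the audit's actual code: robots_blocks, classify, and should_fail_build are hypothetical names, and only two of the checks are modeled. The robots.txt parsing uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser


def robots_blocks(robots_txt: str, crawler: str, url: str) -> bool:
    """True if the robots.txt text itself disallows this crawler for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(crawler, url)


def classify(crawler: str, url: str, robots_txt: str, spoofed_status: int):
    """Return (severity, message) for one crawler. Hypothetical rule set."""
    # Provable from the response itself, so it is safe to treat as an error.
    if robots_blocks(robots_txt, crawler, url):
        return ("error", f"robots.txt disallows {crawler}")
    # A 403 for a spoofed user-agent is ambiguous: report both readings.
    if spoofed_status == 403:
        return (
            "warning",
            f"spoofed {crawler} request got 403: either a user-agent rule that "
            f"also blocks the real {crawler}, or IP allowlisting that lets the "
            f"verified {crawler} through",
        )
    return ("ok", f"{crawler} can reach the page")


def should_fail_build(findings) -> bool:
    # Only error-level findings fail the build; warnings never do.
    return any(severity == "error" for severity, _ in findings)
```

The point of the design is that a warning carries both explanations and never changes the build result, while an error is tied to evidence anyone can re-check from the fetched files.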
Where it fits
The agentmarkup adapters and the CLI generate machine-readable output at build time. @agentmarkup/audit verifies what a deployed site actually serves to AI crawlers. It is the command-line sibling of the hosted website checker: the checker is the quick browser lookup, the audit is the scriptable, CI-friendly version.
Read the audit guide for the full check list, then use the llms.txt, JSON-LD, and AI crawlers guides to fix whatever it surfaces.