How to add llms.txt, JSON-LD, and AI crawler controls to Next.js
Next.js sites need the same machine-readable surface as any other modern website: llms.txt, structured data, crawler rules, and validation. The tricky part is choosing the right integration point so those artifacts reflect your final output instead of an earlier build step.
Why Next.js is slightly different
With plain Vite or Astro, the final HTML output is usually obvious. Next.js can mix static export, prerendered pages, server deployments, and fully dynamic SSR routes in the same app. That means a useful Next integration cannot just be a generic bundler plugin. It has to respect what Next actually emits at build time.
That is what @agentmarkup/next is for. It is a final-output-first adapter built around Next's config and build hooks rather than a Vite-style HTML transform.
What the Next.js adapter gives you
- llms.txt generation from your config, with the homepage discovery link injected automatically
- Optional llms-full.txt with inlined same-site markdown context when mirrors exist
- JSON-LD injection into emitted HTML plus validation of existing schema blocks
- Optional markdown mirrors for thin or noisy built pages that need a cleaner fetch target for agents
- robots.txt patching for AI crawler directives like GPTBot, ClaudeBot, and Google-Extended
- Header support for Content-Signal and markdown canonicals through static _headers output or merged Next header rules
- Build-time validation for schema mistakes, crawler conflicts, thin HTML, and markdown drift
Basic setup
Install the package:
pnpm add -D @agentmarkup/next

Then wrap your Next config:
// next.config.ts
import type { NextConfig } from 'next'
import { withAgentmarkup } from '@agentmarkup/next'
const nextConfig: NextConfig = {
output: 'export',
}
export default withAgentmarkup(
{
site: 'https://example.com',
name: 'Example Docs',
description: 'Technical docs and product pages.',
globalSchemas: [
{ preset: 'webSite', name: 'Example Docs', url: 'https://example.com' },
{ preset: 'organization', name: 'Example Inc.', url: 'https://example.com' },
],
llmsTxt: {
sections: [
{
title: 'Documentation',
entries: [
{
title: 'Getting Started',
url: '/docs/getting-started',
description: 'Setup guide and first steps',
},
],
},
],
},
llmsFullTxt: {
enabled: true,
},
markdownPages: {
enabled: true,
},
contentSignalHeaders: {
enabled: true,
},
aiCrawlers: {
GPTBot: 'allow',
ClaudeBot: 'allow',
PerplexityBot: 'allow',
'Google-Extended': 'allow',
CCBot: 'disallow',
},
},
nextConfig,
)

The important thing to notice is that the config shape is shared across the first-party adapters. The shared AgentMarkupConfig stays framework-agnostic. Only the wrapper changes.
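To make the aiCrawlers map above concrete: it translates into per-bot groups in the patched robots.txt. The sketch below is illustrative output only; the adapter's exact grouping, ordering, and path rules may differ.

```txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Disallow: /
```

Each `User-agent` group is matched independently by crawlers, which is why an allow/disallow decision per bot maps cleanly onto a flat config object.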
Where it works best
The strongest fit is static export and any route where Next emits build-time HTML that can be patched or post-processed. That includes a lot of real App Router sites: docs, marketing pages, blog pages, changelogs, and mixed apps with a meaningful prerendered surface.
On those builds, you get the full output flow:
out/
  llms.txt
  llms-full.txt
  robots.txt
  _headers
  docs/getting-started/index.html
  docs/getting-started.md

Server deployments are still useful too. You keep generated root artifacts and header integration, even when the deployment is not a pure static export.
The one caveat that matters
Fully dynamic SSR routes are the boundary. If Next never emits an HTML file for a route at build time, there is no final file for the adapter to patch afterward.
That does not make the package useless for Next apps. It just means you should be precise about ownership:
- Use @agentmarkup/next for static export, prerendered pages, generated root artifacts, and header integration
- Use the re-exported @agentmarkup/core helpers inside app code for truly dynamic routes that have no build-time HTML file
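Whatever the @agentmarkup/core helper names turn out to be, the underlying move for a fully dynamic route is small: build the JSON-LD object from request-time data and emit it as a script tag in the response. A framework-free sketch of that step, with `jsonLdScriptTag` as a hypothetical name rather than the package's actual API:

```typescript
// Minimal JSON-LD serialization for a fully dynamic route.
// jsonLdScriptTag is a hypothetical helper name, not @agentmarkup/core's API.
type JsonLd = Record<string, unknown>

function jsonLdScriptTag(schema: JsonLd): string {
  // Escape "</" so the embedded JSON cannot terminate the script element early.
  const json = JSON.stringify(schema).replace(/<\//g, '<\\/')
  return `<script type="application/ld+json">${json}</script>`
}

// Example: an Article schema built from request-time data.
const tag = jsonLdScriptTag({
  '@context': 'https://schema.org',
  '@type': 'Article',
  headline: 'Dynamic page title',
  url: 'https://example.com/posts/123',
})
```

In an App Router route, the resulting string can be rendered into the page head at request time, which is exactly the case where no build-time HTML file exists for the adapter to patch.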
That is the honest model for Next. The package is strongest where Next has a final-output artifact. For routes without one, route-level core helpers are the right tool.
Should you enable markdown mirrors?
Only when they add signal. If your emitted HTML is already substantial, keep HTML as the primary fetch target. If the built page is thin, noisy, or heavily shell-like, generated markdown mirrors can give fetch-based agents a cleaner path.
agentmarkup keeps that feature disciplined by generating mirrors from final HTML, keeping them directly fetchable, and adding canonical headers back to the HTML route so search engines keep the page itself as the preferred URL.
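As an illustration of that canonical wiring, a static _headers file (Netlify/Cloudflare Pages style) could pair a markdown mirror with its HTML route along these lines. The paths follow the config example above; the adapter's exact output format is an assumption here.

```txt
# Hypothetical _headers output; exact shape depends on the adapter.
/docs/getting-started.md
  Link: <https://example.com/docs/getting-started>; rel="canonical"
  Content-Type: text/markdown; charset=utf-8
```

The `Link: …; rel="canonical"` header on the mirror tells search engines the HTML page remains the preferred URL, so the mirror serves agents without splitting ranking signals.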
Why this is useful for the Next.js community
A lot of Next.js teams already care about structured metadata, crawlability, and build output quality. They just do not want four separate solutions for llms.txt, JSON-LD, crawler policy, markdown mirrors, and validation.
The practical value of @agentmarkup/next is that it keeps those concerns in one build step, on the same config surface, with the same rules the public checker looks for on deployed sites.
The bottom line
If your Next.js app has a real static or prerendered surface, @agentmarkup/next is the natural adapter. It gives you build-time machine-readable output without making you stitch the pieces together manually.
Start with the adapter for the routes Next emits, keep markdown mirrors optional, and use @agentmarkup/core directly only where fully dynamic SSR makes that necessary. That is the cleanest model for shipping a machine-readable Next.js website today.
If you need the underlying pieces in more detail, read the llms.txt guide, the JSON-LD guide, and the AI crawlers guide.