How to add llms.txt, JSON-LD, and AI crawler controls to Next.js
Next.js sites need the same machine-readable surface as any other modern website: llms.txt, structured data, crawler rules, and validation. The tricky part is choosing the right integration point so those artifacts reflect your final output instead of an earlier build step.
Why Next.js is slightly different
With plain Vite or Astro, the final HTML output is usually obvious. Next.js can mix static export, prerendered pages, server deployments, and fully dynamic SSR routes in the same app. That means a useful Next integration cannot just be a generic bundler plugin. It has to respect what Next actually emits at build time.
That is what @agentmarkup/next is for. It is a final-output-first adapter built around Next's config and build hooks rather than a Vite-style HTML transform.
What the Next.js adapter gives you
- llms.txt generation from your config, with the homepage discovery link injected automatically
- Optional llms-full.txt with inlined same-site markdown context when mirrors exist
- JSON-LD injection into emitted HTML plus validation of existing schema blocks
- Optional markdown mirrors for thin or noisy built pages that need a cleaner fetch target for agents
- robots.txt patching for AI crawler directives like GPTBot, ClaudeBot, and Google-Extended
- Header support for Content-Signal and markdown canonicals through static _headers output or merged Next header rules
- Build-time validation for schema mistakes, crawler conflicts, thin HTML, and markdown drift
Basic setup
Install the package:
pnpm add -D @agentmarkup/next

Then wrap your Next config:
// next.config.ts
import type { NextConfig } from 'next'
import { withAgentmarkup } from '@agentmarkup/next'
const nextConfig: NextConfig = {
output: 'export',
}
export default withAgentmarkup(
{
site: 'https://example.com',
name: 'Example Docs',
description: 'Technical docs and product pages.',
globalSchemas: [
{ preset: 'webSite', name: 'Example Docs', url: 'https://example.com' },
{ preset: 'organization', name: 'Example Inc.', url: 'https://example.com' },
],
llmsTxt: {
sections: [
{
title: 'Documentation',
entries: [
{
title: 'Getting Started',
url: '/docs/getting-started',
description: 'Setup guide and first steps',
},
],
},
],
},
llmsFullTxt: {
enabled: true,
},
markdownPages: {
enabled: true,
},
contentSignalHeaders: {
enabled: true,
},
aiCrawlers: {
GPTBot: 'allow',
ClaudeBot: 'allow',
PerplexityBot: 'allow',
'Google-Extended': 'allow',
CCBot: 'disallow',
},
},
nextConfig,
)

The important thing to notice is that the config shape is shared across the first-party adapters. The shared AgentMarkupConfig stays framework-agnostic. Only the wrapper changes.
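To make the aiCrawlers map above concrete: it translates into per-bot groups in the patched robots.txt. The sketch below is illustrative output only; the adapter's exact grouping, ordering, and path rules may differ.

```txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Disallow: /
```

Each `User-agent` group is matched independently by crawlers, which is why an allow/disallow decision per bot maps cleanly onto a flat config object.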
Where it works best
The strongest fit is static export and any route where Next emits build-time HTML that can be patched or post-processed. That includes a lot of real App Router sites: docs, marketing pages, blog pages, changelogs, and mixed apps with a meaningful prerendered surface.
On those builds, you get the full output flow:
out/
  llms.txt
  llms-full.txt
  robots.txt
  _headers
  docs/getting-started/index.html
  docs/getting-started.md

Server deployments are still useful too. You keep generated root artifacts and header integration, even when the deployment is not a pure static export.
The one caveat that matters
Fully dynamic SSR routes are the boundary. If Next never emits an HTML file for a route at build time, there is no final file for the adapter to patch afterward.
That does not make the package useless for Next apps. It just means you should be precise about ownership:
- Use @agentmarkup/next for static export, prerendered pages, generated root artifacts, and header integration
- Use the re-exported @agentmarkup/core helpers inside app code for truly dynamic routes that have no build-time HTML file
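Whatever the @agentmarkup/core helper names turn out to be, the underlying move for a fully dynamic route is small: build the JSON-LD object from request-time data and emit it as a script tag in the response. A framework-free sketch of that step, with `jsonLdScriptTag` as a hypothetical name rather than the package's actual API:

```typescript
// Minimal JSON-LD serialization for a fully dynamic route.
// jsonLdScriptTag is a hypothetical helper name, not @agentmarkup/core's API.
type JsonLd = Record<string, unknown>

function jsonLdScriptTag(schema: JsonLd): string {
  // Escape "</" so the embedded JSON cannot terminate the script element early.
  const json = JSON.stringify(schema).replace(/<\//g, '<\\/')
  return `<script type="application/ld+json">${json}</script>`
}

// Example: an Article schema built from request-time data.
const tag = jsonLdScriptTag({
  '@context': 'https://schema.org',
  '@type': 'Article',
  headline: 'Dynamic page title',
  url: 'https://example.com/posts/123',
})
```

In an App Router route, the resulting string can be rendered into the page head at request time, which is exactly the case where no build-time HTML file exists for the adapter to patch.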
That is the honest model for Next. The package is strongest where Next has a final-output artifact. For routes without one, route-level core helpers are the right tool.
Should you enable markdown mirrors?
Only when they add signal. If your emitted HTML is already substantial, keep HTML as the primary fetch target. If the built page is thin, noisy, or heavily shell-like, generated markdown mirrors can give fetch-based agents a cleaner path.
agentmarkup keeps that feature disciplined by generating mirrors from final HTML, keeping them directly fetchable, and adding canonical headers back to the HTML route so search engines keep the page itself as the preferred URL.
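As an illustration of that canonical wiring, a static _headers file (Netlify/Cloudflare Pages style) could pair a markdown mirror with its HTML route along these lines. The paths follow the config example above; the adapter's exact output format is an assumption here.

```txt
# Hypothetical _headers output; exact shape depends on the adapter.
/docs/getting-started.md
  Link: <https://example.com/docs/getting-started>; rel="canonical"
  Content-Type: text/markdown; charset=utf-8
```

The `Link: …; rel="canonical"` header on the mirror tells search engines the HTML page remains the preferred URL, so the mirror serves agents without splitting ranking signals.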
Why this is useful for the Next.js community
A lot of Next.js teams already care about structured metadata, crawlability, and build output quality. They just do not want four separate solutions for llms.txt, JSON-LD, crawler policy, markdown mirrors, and validation.
The practical value of @agentmarkup/next is that it keeps those concerns in one build step, on the same config surface, with the same rules the public checker looks for on deployed sites.
The bottom line
If your Next.js app has a real static or prerendered surface, @agentmarkup/next is the natural adapter. It gives you build-time machine-readable output without making you stitch the pieces together manually.
Start with the adapter for the routes Next emits, keep markdown mirrors optional, and use @agentmarkup/core directly only where fully dynamic SSR makes that necessary. That is the cleanest model for shipping a machine-readable Next.js website today.
If you need the underlying pieces in more detail, read the llms.txt guide, the JSON-LD guide, and the AI crawlers guide.