By SitemapFixer Team
Updated April 2026

X-Robots-Tag: The HTTP Alternative to Meta Robots


What the X-Robots-Tag Header Is

The X-Robots-Tag is an HTTP response header that tells search engine crawlers whether to index a resource and whether to follow the links it contains. It's the HTTP header equivalent of the <meta name="robots"> tag, with one critical difference: it works on any type of file, not just HTML pages.

A meta robots tag lives inside an HTML document's <head> — which means it can only control indexing of HTML pages. PDFs, images, videos, XML files, and other non-HTML resources have no <head> section, so meta robots can't be used on them. The X-Robots-Tag header solves this: it's sent as part of the HTTP response headers and works on any file type your server serves.
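
To make this concrete, here is a minimal sketch of a Node.js server that serves a PDF with the header attached; the file name, path, and port are placeholders, and in practice you would set the header in your web server or framework config as shown later in this guide.

// serve-pdf.js: minimal sketch (hypothetical file name and path)
const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  // The noindex directive travels in the response headers, so it
  // applies even though a PDF has no <head> for a meta robots tag.
  res.setHeader('Content-Type', 'application/pdf');
  res.setHeader('X-Robots-Tag', 'noindex, nofollow');
  fs.createReadStream('./report.pdf').pipe(res);
}).listen(3000);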

Google, Bing, and most major search engines support X-Robots-Tag. Googlebot reads the response headers before processing the document body, so the directive is respected immediately — even for very large files.

X-Robots-Tag vs Meta Robots: When to Use Each

Use meta robots for HTML pages. It's simpler to implement (just add a tag to your HTML), easier to manage per-page, and works in every CMS and framework. WordPress SEO plugins (Yoast, Rank Math) use meta robots tags for per-page noindex settings.

Use X-Robots-Tag for: PDF files you don't want indexed (order confirmations, internal documents and reports), image files that shouldn't appear in Google Images, video files, CSV downloads, and any other non-HTML resource. Also use it when you want to apply indexing rules to entire directories of files at the server config level, without touching individual files.

The directives available in X-Robots-Tag are the same as meta robots: noindex (don't add to search index), nofollow (don't follow links in this document), nosnippet (don't show a text snippet in search results), noarchive (don't show a cached link), none (equivalent to noindex + nofollow), and all (default behavior, no restrictions).
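
Directives can be combined in a single comma-separated header value, and Google also supports prefixing directives with a user-agent token to scope them to one crawler. A brief sketch, assuming a Node.js request handler whose response object is named res:

// Sketch only: assumes res is a Node http.ServerResponse.

// Multiple directives in one header value apply to all crawlers:
res.setHeader('X-Robots-Tag', 'noindex, nosnippet, noarchive');

// A user-agent token scopes directives to one crawler. Pass an array
// to emit several X-Robots-Tag header lines at once (a second
// setHeader call with a string would overwrite the first):
res.setHeader('X-Robots-Tag', ['noindex, nosnippet', 'googlebot: nofollow']);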

How to Set X-Robots-Tag on Apache, Nginx, and Next.js

Apache: Use a FilesMatch block (valid in .htaccess) to target specific file types, or a Directory block in the main server config to cover a whole directory:

# Apache .htaccess — noindex all PDF files
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

# Noindex all files in the /private/ directory.
# Note: <Directory> blocks are only valid in the main server config
# (httpd.conf or a vhost file), not in .htaccess. From .htaccess, place
# a file containing only the Header line inside /private/ instead.
<Directory /var/www/html/private>
  Header set X-Robots-Tag "noindex"
</Directory>

Nginx: Use the location block with regex matching for file extensions:

# Nginx — noindex PDF files
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}

# Noindex files in /internal/ path
location /internal/ {
    add_header X-Robots-Tag "noindex" always;
}

Next.js: Use the headers() function in next.config.js to apply response headers based on path patterns:

// next.config.js
module.exports = {
  async headers() {
    return [
      {
        // Apply to all PDFs served from /documents/
        source: '/documents/:path*.pdf',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noindex, nofollow',
          },
        ],
      },
      {
        // Apply to a specific staging path
        source: '/preview/:path*',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noindex',
          },
        ],
      },
    ];
  },
};

How to Verify X-Robots-Tag Is Working

Use curl to inspect HTTP response headers without downloading the full file. The -I flag fetches headers only:

# Check response headers for a PDF
curl -I https://yourdomain.com/files/report.pdf

# Expected output includes:
# HTTP/2 200
# content-type: application/pdf
# x-robots-tag: noindex, nofollow

# Check as Googlebot user-agent (some servers respond differently)
curl -I -A "Googlebot/2.1 (+http://www.google.com/bot.html)" https://yourdomain.com/files/report.pdf

In Chrome DevTools: open the Network tab, load the URL, click the resource, and look at the Response Headers section. The x-robots-tag header should appear there if it's being set correctly by your server.
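
If you want to script this check, the same verification can be done with the fetch API built into Node.js 18+; the script name is hypothetical and the URL is passed as an argument.

// check-xrobots.js: hypothetical helper, run as: node check-xrobots.js <url>
// Requires Node.js 18+ for the built-in fetch API.
const url = process.argv[2];

fetch(url, { method: 'HEAD' })
  .then((res) => {
    const tag = res.headers.get('x-robots-tag');
    console.log(`status: ${res.status}`);
    console.log(tag ? `x-robots-tag: ${tag}` : 'no x-robots-tag header set');
  })
  .catch((err) => console.error(`request failed: ${err.message}`));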

X-Robots-Tag vs robots.txt: Key Differences

Both robots.txt and X-Robots-Tag can control crawler access to your resources, but they work differently and serve different purposes:

robots.txt controls whether Googlebot is allowed to crawl (visit) a URL. If you disallow a URL in robots.txt, Google won't crawl it at all. But Google may still index the URL (show it in search results) if it discovers the URL from links — just without crawling the page to get content.

X-Robots-Tag noindex prevents Google from indexing the page, but Google must still be able to crawl the page to see the noindex directive. This is the critical distinction: a URL blocked by robots.txt cannot be noindexed via X-Robots-Tag because Googlebot never fetches the headers. If you want to remove a page from Google's index, use noindex via X-Robots-Tag (or meta robots) and ensure the URL is NOT blocked in robots.txt.
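
To catch that misconfiguration, a script can check both conditions at once. The sketch below fetches robots.txt, prefix-matches every Disallow line (a deliberate simplification; real robots.txt parsing scopes rules per user agent and supports wildcards), and warns when a noindexed URL is also blocked from crawling. It assumes Node.js 18+ and a hypothetical script name.

// noindex-conflict-check.js: hypothetical helper, Node.js 18+.
async function checkConflict(urlString) {
  const url = new URL(urlString);

  // 1. Is the path disallowed in robots.txt? (Naive prefix match on
  // every Disallow line, ignoring user-agent groups and wildcards.)
  const robotsTxt = await (await fetch(`${url.origin}/robots.txt`)).text();
  const disallowed = robotsTxt
    .split('\n')
    .filter((line) => line.trim().toLowerCase().startsWith('disallow:'))
    .map((line) => line.trim().slice('disallow:'.length).trim())
    .some((rule) => rule && url.pathname.startsWith(rule));

  // 2. Does the response carry a noindex directive?
  const res = await fetch(urlString, { method: 'HEAD' });
  const tag = res.headers.get('x-robots-tag') || '';
  const noindexed = tag.includes('noindex');

  if (disallowed && noindexed) {
    console.warn('Conflict: this URL sends noindex but is blocked by');
    console.warn('robots.txt, so crawlers never fetch the header to see it.');
  } else {
    console.log({ disallowed, noindexed });
  }
}

checkConflict(process.argv[2]).catch((err) => console.error(err.message));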

