By SitemapFixer Team
Updated April 2026

X-Robots-Tag: The HTTP Alternative to Meta Robots


What the X-Robots-Tag Header Is

The X-Robots-Tag is an HTTP response header that tells search engine crawlers whether to index a resource and whether to follow the links it contains. It's the HTTP header equivalent of the <meta name="robots"> tag, with one critical difference: it works on any type of file, not just HTML pages.

A meta robots tag lives inside an HTML document's <head> — which means it can only control indexing of HTML pages. PDFs, images, videos, XML files, and other non-HTML resources have no <head> section, so meta robots can't be used on them. The X-Robots-Tag header solves this: it's sent as part of the HTTP response headers and works on any file type your server serves.
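
To make this concrete, here is a minimal sketch of a Node.js server that serves a PDF with the header attached; the file name, path, and port are placeholders, and in practice you would set the header in your web server or framework config as shown later in this guide.

// serve-pdf.js: minimal sketch (hypothetical file name and path)
const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  // The noindex directive travels in the response headers, so it
  // applies even though a PDF has no <head> for a meta robots tag.
  res.setHeader('Content-Type', 'application/pdf');
  res.setHeader('X-Robots-Tag', 'noindex, nofollow');
  fs.createReadStream('./report.pdf').pipe(res);
}).listen(3000);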

Google, Bing, and most major search engines support X-Robots-Tag. Googlebot reads the response headers before processing the document body, so the directive is respected immediately — even for very large files.

X-Robots-Tag vs Meta Robots: When to Use Each

Use meta robots for HTML pages. It's simpler to implement (just add a tag to your HTML), easier to manage per-page, and works in every CMS and framework. WordPress SEO plugins (Yoast, Rank Math) use meta robots tags for per-page noindex settings.

Use X-Robots-Tag for: PDF files you don't want indexed (order confirmations, internal documents and reports), image files that shouldn't appear in Google Images, video files, CSV downloads, and any other non-HTML resource. Also use it when you want to apply indexing rules to entire directories of files at the server config level, without touching individual files.

The directives available in X-Robots-Tag are the same as meta robots: noindex (don't add to search index), nofollow (don't follow links in this document), nosnippet (don't show a text snippet in search results), noarchive (don't show a cached link), none (equivalent to noindex + nofollow), and all (default behavior, no restrictions).
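
Directives can be combined in a single comma-separated header value, and Google also supports prefixing directives with a user-agent token to scope them to one crawler. A brief sketch, assuming a Node.js request handler whose response object is named res:

// Sketch only: assumes res is a Node http.ServerResponse.

// Multiple directives in one header value apply to all crawlers:
res.setHeader('X-Robots-Tag', 'noindex, nosnippet, noarchive');

// A user-agent token scopes directives to one crawler. Pass an array
// to emit several X-Robots-Tag header lines at once (a second
// setHeader call with a string would overwrite the first):
res.setHeader('X-Robots-Tag', ['noindex, nosnippet', 'googlebot: nofollow']);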

How to Set X-Robots-Tag on Apache, Nginx, and Next.js

Apache: Use a FilesMatch block (valid in .htaccess) to target specific file types, or a Directory block in the main server config to cover a whole directory:

# Apache .htaccess — noindex all PDF files
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

# Noindex all files in the /private/ directory.
# Note: <Directory> blocks are only valid in the main server config
# (httpd.conf or a vhost file), not in .htaccess. From .htaccess, place
# a file containing only the Header line inside /private/ instead.
<Directory /var/www/html/private>
  Header set X-Robots-Tag "noindex"
</Directory>

Nginx: Use the location block with regex matching for file extensions:

# Nginx — noindex PDF files
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow" always;
}

# Noindex files in /internal/ path
location /internal/ {
    add_header X-Robots-Tag "noindex" always;
}

Next.js: Use the headers() function in next.config.js to apply response headers based on path patterns:

// next.config.js
module.exports = {
  async headers() {
    return [
      {
        // Apply to all PDFs served from /documents/
        source: '/documents/:path*.pdf',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noindex, nofollow',
          },
        ],
      },
      {
        // Apply to a specific staging path
        source: '/preview/:path*',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noindex',
          },
        ],
      },
    ];
  },
};

How to Verify X-Robots-Tag Is Working

Use curl to inspect HTTP response headers without downloading the full file. The -I flag fetches headers only:

# Check response headers for a PDF
curl -I https://yourdomain.com/files/report.pdf

# Expected output includes:
# HTTP/2 200
# content-type: application/pdf
# x-robots-tag: noindex, nofollow

# Check as Googlebot user-agent (some servers respond differently)
curl -I -A "Googlebot/2.1 (+http://www.google.com/bot.html)" https://yourdomain.com/files/report.pdf

In Chrome DevTools: open the Network tab, load the URL, click the resource, and look at the Response Headers section. The x-robots-tag header should appear there if it's being set correctly by your server.
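
If you want to script this check, the same verification can be done with the fetch API built into Node.js 18+; the script name is hypothetical and the URL is passed as an argument.

// check-xrobots.js: hypothetical helper, run as: node check-xrobots.js <url>
// Requires Node.js 18+ for the built-in fetch API.
const url = process.argv[2];

fetch(url, { method: 'HEAD' })
  .then((res) => {
    const tag = res.headers.get('x-robots-tag');
    console.log(`status: ${res.status}`);
    console.log(tag ? `x-robots-tag: ${tag}` : 'no x-robots-tag header set');
  })
  .catch((err) => console.error(`request failed: ${err.message}`));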

X-Robots-Tag vs robots.txt: Key Differences

Both robots.txt and X-Robots-Tag can control crawler access to your resources, but they work differently and serve different purposes:

robots.txt controls whether Googlebot is allowed to crawl (visit) a URL. If you disallow a URL in robots.txt, Google won't crawl it at all. But Google may still index the URL (show it in search results) if it discovers the URL from links — just without crawling the page to get content.

X-Robots-Tag noindex prevents Google from indexing the page, but Google must still be able to crawl the page to see the noindex directive. This is the critical distinction: a URL blocked by robots.txt cannot be noindexed via X-Robots-Tag because Googlebot never fetches the headers. If you want to remove a page from Google's index, use noindex via X-Robots-Tag (or meta robots) and ensure the URL is NOT blocked in robots.txt.
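
To catch that misconfiguration, a script can check both conditions at once. The sketch below fetches robots.txt, prefix-matches every Disallow line (a deliberate simplification; real robots.txt parsing scopes rules per user agent and supports wildcards), and warns when a noindexed URL is also blocked from crawling. It assumes Node.js 18+ and a hypothetical script name.

// noindex-conflict-check.js: hypothetical helper, Node.js 18+.
async function checkConflict(urlString) {
  const url = new URL(urlString);

  // 1. Is the path disallowed in robots.txt? (Naive prefix match on
  // every Disallow line, ignoring user-agent groups and wildcards.)
  const robotsTxt = await (await fetch(`${url.origin}/robots.txt`)).text();
  const disallowed = robotsTxt
    .split('\n')
    .filter((line) => line.trim().toLowerCase().startsWith('disallow:'))
    .map((line) => line.trim().slice('disallow:'.length).trim())
    .some((rule) => rule && url.pathname.startsWith(rule));

  // 2. Does the response carry a noindex directive?
  const res = await fetch(urlString, { method: 'HEAD' });
  const tag = res.headers.get('x-robots-tag') || '';
  const noindexed = tag.includes('noindex');

  if (disallowed && noindexed) {
    console.warn('Conflict: this URL sends noindex but is blocked by');
    console.warn('robots.txt, so crawlers never fetch the header to see it.');
  } else {
    console.log({ disallowed, noindexed });
  }
}

checkConflict(process.argv[2]).catch((err) => console.error(err.message));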

