X-Robots-Tag: The HTTP Alternative to Meta Robots
What the X-Robots-Tag Header Is
The X-Robots-Tag is an HTTP response header that tells search engine crawlers whether to index a resource and whether to follow its links. It's the HTTP header equivalent of the <meta name="robots"> tag, with one critical difference: it works on any type of file, not just HTML pages.
A meta robots tag lives inside an HTML document's <head> — which means it can only control indexing of HTML pages. PDFs, images, videos, XML files, and other non-HTML resources have no <head> section, so meta robots can't be used on them. The X-Robots-Tag header solves this: it's sent as part of the HTTP response headers and works on any file type your server serves.
Google, Bing, and most major search engines support X-Robots-Tag. Because HTTP headers arrive before the response body, Googlebot sees the directive before processing the document itself, so it is honored regardless of file size.
X-Robots-Tag vs Meta Robots: When to Use Each
Use meta robots for HTML pages. It's simpler to implement (just add a tag to your HTML), easier to manage per-page, and works in every CMS and framework. WordPress SEO plugins (Yoast, Rank Math) use meta robots tags for per-page noindex settings.
Use X-Robots-Tag for: PDF files you don't want indexed (order confirmations, internal documents and reports), image files that shouldn't appear in Google Images, video files, CSV downloads, and any other non-HTML resource. Also use it when you want to apply indexing rules to entire directories of files at the server config level, without touching individual files.
The directives available in X-Robots-Tag are the same as meta robots:
- noindex: don't add this resource to the search index
- nofollow: don't follow links in this document
- nosnippet: don't show a text snippet in search results
- noarchive: don't show a cached link
- none: equivalent to noindex + nofollow
- all: default behavior, no restrictions
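In a raw HTTP response, the header looks like the sketch below (the PDF content type is illustrative). Multiple X-Robots-Tag headers can be combined, and Google also supports scoping a directive to one crawler by prefixing its user agent name:

```http
HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex, nofollow
X-Robots-Tag: googlebot: nosnippet
```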
How to Set X-Robots-Tag on Apache, Nginx, and Next.js
Apache (.htaccess): Use the FilesMatch directive to apply the header to specific file types:
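A minimal .htaccess sketch. The extension list is illustrative; adjust it to the files you want to keep out of the index. Note this relies on mod_headers being enabled:

```apache
# Send X-Robots-Tag: noindex, nofollow for PDF and Word files
# (requires mod_headers)
<FilesMatch "\.(pdf|doc|docx)$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
```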
Nginx: Use the location block with regex matching for file extensions:
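A minimal Nginx sketch with the same illustrative extensions. Keep in mind that add_header directives in a location block replace, rather than extend, any add_header directives inherited from the server block:

```nginx
# Case-insensitive regex match on file extension
location ~* \.(pdf|doc|docx)$ {
  add_header X-Robots-Tag "noindex, nofollow";
}
```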
Next.js: Use the headers() function in next.config.js to apply response headers based on path patterns:
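A sketch for next.config.js, assuming the files live under a /downloads/ path (the path is illustrative). Note that headers() only takes effect when the app is served by Next.js itself; it's ignored for statically exported sites:

```javascript
// next.config.js
// Sketch: applies X-Robots-Tag to everything under /downloads/ (illustrative path)
const nextConfig = {
  async headers() {
    return [
      {
        source: '/downloads/:path*',
        headers: [
          { key: 'X-Robots-Tag', value: 'noindex, nofollow' },
        ],
      },
    ];
  },
};

module.exports = nextConfig;
```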
How to Verify X-Robots-Tag Is Working
Use curl to inspect HTTP response headers without downloading the full file. The -I flag fetches headers only:
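For example, against a placeholder URL (substitute your own file), optionally piping through grep to isolate the header:

```shell
# -I sends a HEAD request and prints only the response headers
curl -I https://www.example.com/docs/report.pdf

# -s suppresses progress output; grep -i matches the header case-insensitively
curl -sI https://www.example.com/docs/report.pdf | grep -i x-robots-tag
```

If the header is set, you should see a line like x-robots-tag: noindex, nofollow in the output.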
In Chrome DevTools: open the Network tab, load the URL, click the resource, and look at the Response Headers section. The x-robots-tag header should appear there if it's being set correctly by your server.
X-Robots-Tag vs robots.txt: Key Differences
Both robots.txt and X-Robots-Tag can control crawler access to your resources, but they work differently and serve different purposes:
robots.txt controls whether Googlebot is allowed to crawl (visit) a URL. If you disallow a URL in robots.txt, Google won't crawl it at all. But Google may still index the URL (show it in search results) if it discovers the URL from links — just without crawling the page to get content.
X-Robots-Tag noindex prevents Google from indexing the page, but Google must still be able to crawl the page to see the noindex directive. This is the critical distinction: a URL blocked by robots.txt cannot be noindexed via X-Robots-Tag because Googlebot never fetches the headers. If you want to remove a page from Google's index, use noindex via X-Robots-Tag (or meta robots) and ensure the URL is NOT blocked in robots.txt.
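To make the conflict concrete, a robots.txt rule like this (the /private/ path is illustrative) would stop Googlebot from ever fetching the response headers, so any X-Robots-Tag: noindex on those URLs goes unseen:

```text
# robots.txt — blocks crawling, which also blocks discovery of noindex headers
User-agent: *
Disallow: /private/
```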
Related Guides
- Noindex Tag: How to Prevent Pages from Being Indexed
- Robots.txt Guide: Block Crawlers Without Hurting SEO
- Canonical Tags: How to Fix Duplicate Content Issues
- Crawled - Currently Not Indexed: Causes and Fixes
- GSC Errors Hub: Fix All Google Search Console Issues
- Robots Noarchive: When to Block Google's Cached Page
- Canonical Noindex: Why You Should Never Combine Them
- Noindex Directives: Complete Guide to Indexing Control
- Noindex Nofollow: Combined Robots Meta Directive
- How to Remove a URL from Google Search Console
- De-Indexing Pages: How to Remove Content from Google
- .htaccess Noindex: Server-Level Index Control