De-indexing Pages: 4 Methods to Remove URLs From Google
De-indexing is the process of removing URLs from Google's search index so they no longer appear in search results. It sounds simple — and the methods themselves are straightforward — but the most common mistake site owners make is choosing the wrong tool for the job. The single biggest myth, which we'll bust early, is that robots.txt deindexes pages. It does not. Pages already in the index will stay there indefinitely if you only block them in robots.txt, often resurfacing as those infamous "No information is available for this page" results. This guide walks through the four legitimate ways to deindex pages, when to use each, what timeline to expect, and how to recover if you accidentally deindex something you wanted to keep.
The 4 Ways to Deindex a Page
There are exactly four supported methods to remove a URL from Google's index. Each works in different scenarios, and most real deindexing projects use a combination of two or more.
1. The noindex meta tag — the standard method for HTML pages. You add a <meta name="robots" content="noindex"> tag in the page <head>, Googlebot crawls the page, sees the directive, and removes the URL from the index on the next index refresh.
2. The X-Robots-Tag HTTP header — the noindex equivalent for non-HTML files. PDFs, images, videos, and other file types can't carry meta tags, so you set the directive in the HTTP response header instead. Same effect, different transport mechanism.
3. Google Search Console's Removal Tool — a temporary hide that suppresses URLs from search results for approximately 6 months. It is fast (effective within hours) but reversible: once the 6 months expire, if the URL still exists and is indexable, it will reappear in search.
4. HTTP status codes 410 (Gone) or 404 (Not Found) — used when you genuinely want to delete the page. A 410 signals permanent removal more strongly than 404, and Google deindexes 410 URLs faster (typically within days vs. weeks for 404).
The right method depends on the situation. Page still exists but should be hidden from search? Use noindex. Page is a PDF or image? Use X-Robots-Tag. Need urgent removal (legal, privacy, leaked draft)? Use the GSC Removal Tool plus a permanent method. Page is being deleted entirely? Use 410.
Method 1: noindex Meta Tag for HTML Pages
This is the workhorse of deindexing. For any HTML page that should remain accessible to humans but invisible in Google, the noindex meta tag is the right tool. The implementation is a single line in the document head:
<!-- Standard noindex (applies to all crawlers) -->
<meta name="robots" content="noindex">

<!-- Google-specific noindex -->
<meta name="googlebot" content="noindex">

<!-- Combined: deindex AND prevent following links on the page -->
<meta name="robots" content="noindex, nofollow">

<!-- Allow link following but still deindex (default behavior) -->
<meta name="robots" content="noindex, follow">
Critical detail: the page MUST be crawlable for this to work. If robots.txt disallows the URL, Googlebot never fetches the HTML and never sees the meta tag — so the page stays indexed. Always verify your robots.txt does not block the URL before relying on a noindex tag.
For framework-specific implementation, the canonical Next.js App Router approach is to set robots: { index: false, follow: true } in the metadata export. WordPress users can toggle "Allow search engines to show this Post in search results?" to No in Yoast SEO or Rank Math's per-post settings panel.
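For illustration, here is a minimal sketch of that App Router pattern; the route path and component name are hypothetical, and any page or layout file works the same way:

// app/internal-report/page.tsx (hypothetical route)
import type { Metadata } from "next";

export const metadata: Metadata = {
  // Renders <meta name="robots" content="noindex, follow"> in the page head
  robots: {
    index: false,
    follow: true,
  },
};

export default function InternalReportPage() {
  return <h1>Visible to visitors, hidden from search</h1>;
}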
Method 2: X-Robots-Tag for PDFs, Images, and Non-HTML Files
You can't put a meta tag inside a PDF or a JPG. For these, Google supports the X-Robots-Tag HTTP response header, which carries the same directives but rides along with the HTTP response instead of inside the document. This is the only way to deindex non-HTML resources, and it's underused — most sites have indexed PDF leftovers from old marketing campaigns sitting in search results because no one knew how to remove them.
Implementation depends on your web server. Here are the two most common configurations:
# Apache: deindex all PDFs site-wide via .htaccess
<FilesMatch "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
# Apache: deindex all images
<FilesMatch "\.(jpg|jpeg|png|gif|webp)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
# nginx: deindex PDFs
location ~* \.pdf$ {
add_header X-Robots-Tag "noindex, nofollow";
}
# nginx: deindex specific path
location /private-files/ {
add_header X-Robots-Tag "noindex";
}
# Verify it's working with curl
curl -I https://example.com/whitepaper.pdf | grep -i x-robots-tag
# Expected: X-Robots-Tag: noindex, nofollow
The X-Robots-Tag works for any file type Google indexes — PDFs, DOCX, XLSX, images, videos, and even HTML if you prefer header-based control over meta tags. For application-served files (PDFs generated dynamically by Node.js, Django, Rails, etc.) you can set the header in your application code rather than the web server config.
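For the application-code route, a minimal Express sketch looks like the following; the route, file path, and port are hypothetical, and any framework that lets you set response headers works the same way:

// serve-pdf.ts (hypothetical): attach X-Robots-Tag when serving a PDF
import express from "express";
import path from "path";

const app = express();

app.get("/downloads/whitepaper.pdf", (_req, res) => {
  // A PDF cannot carry a robots meta tag, so the directive rides in the response header
  res.setHeader("X-Robots-Tag", "noindex, nofollow");
  res.sendFile(path.resolve("files", "whitepaper.pdf"));
});

app.listen(3000);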
Method 3: Google Search Console Removal Tool
The GSC Removal Tool (Indexing → Removals) is the fastest way to suppress a URL from Google search results — but it's also the most misunderstood. Two facts site owners frequently miss:
It is temporary. Removal lasts approximately 6 months. After that, if the URL still exists, returns 200 OK, and lacks a noindex directive, it will reappear in search results. The Removal Tool is not a substitute for permanent deindexing; it's a fast-acting bandage you apply while the underlying permanent fix (noindex tag, 410 status, content deletion) propagates.
It only hides from Google search results. The URL stays in Google's index, and the page itself is still publicly accessible if anyone has the link. Removal Tool is not the same as deletion.
Use case: you accidentally published a draft, a customer's personal data leaked into a public URL, or a legal request requires immediate suppression. Submit the URL via the Removal Tool first (effective within hours), then deploy the permanent fix (noindex or 410) to ensure the URL stays out of the index after the 6-month window.
One more caveat: the Removal Tool is only available through the Search Console interface. Google's public Search Console API does not currently expose removal requests, so they cannot be submitted programmatically. For batch suppression, use the tool's "Remove all URLs with this prefix" option to cover an entire directory in a single request, and pair it with a permanent method (noindex or 410) deployed at scale.
Why robots.txt Does NOT Deindex Pages (The #1 Myth)
This is the most damaging misconception in technical SEO. Adding Disallow: /private/ to your robots.txt does not remove already-indexed pages from Google. It blocks crawling, which is a different operation entirely. Here's the mechanism that breaks:
If a URL is already indexed and you then disallow it in robots.txt, Googlebot stops crawling the URL. But the URL stays in the index — Google has no way to learn anything new about it, including whether it should be deindexed. The result is a stuck listing, often with the snippet replaced by "A description for this result is not available because of this site's robots.txt" or "No information is available for this page." This can persist for years.
If a URL was never indexed but accumulates external links, Google can still index the URL based on those links alone — without crawling it. The indexed version will have no description, just the URL itself, but it will appear in search results.
The correct sequence to deindex a URL using a noindex tag:
# WRONG: this leaves the URL stuck in the index forever
# robots.txt
User-agent: *
Disallow: /old-section/

# RIGHT: allow crawling, serve noindex on the page
# robots.txt
User-agent: *
# (no Disallow rule for /old-section/)

# On each /old-section/ page:
# <meta name="robots" content="noindex">

# After Google has recrawled and deindexed all pages,
# THEN you can add the Disallow rule to save crawl budget:
User-agent: *
Disallow: /old-section/
The order matters: noindex first, wait for recrawl and deindexing to complete (verify with site: search), then add the robots.txt block. See our deeper analysis in robots.txt vs noindex.
Deindexing Timeline: What to Expect
Deindexing is not instant. Even with the right tag in place, Google must recrawl the URL, process the directive, and refresh its index — and this happens on Google's schedule, not yours. Realistic timelines by method:
noindex meta tag (high-priority pages): 3–14 days. Pages with strong internal links, frequent updates, or significant inbound traffic get recrawled quickly. The deindexing kicks in on the first recrawl after the tag is deployed.
noindex meta tag (low-priority pages): 4–12 weeks. Long-tail pages, deep archive content, and rarely-linked URLs may sit waiting for a recrawl for months. You can speed this up by submitting the URL via GSC URL Inspection → Request Indexing — counterintuitively, this also accelerates deindexing because it triggers an immediate fetch.
X-Robots-Tag header: same timeline as meta noindex. The directive is processed identically once Google fetches the resource.
410 Gone status: 1–7 days for high-priority URLs, 2–4 weeks for low-priority. Google treats 410 as a strong signal of permanent removal and deindexes faster than for noindex tags.
404 Not Found status: 2–6 weeks. Google is more cautious with 404s because they're often transient (server errors, broken deploys). It will recrawl multiple times before deindexing to confirm the page is genuinely gone.
GSC Removal Tool: within hours. This is the only fast-acting method, and the trade-off is that it's temporary.
Permanent vs Temporary Removal
Choosing between permanent and temporary removal comes down to whether the page should ever be indexed again:
Permanent removal scenarios: retired product pages, deleted articles, expired campaign landing pages, or content that has permanently moved (301 redirect to the new URL). Use 410 (preferred) or 404, plus the GSC Removal Tool for the immediate hide while Google reprocesses.
Indexed-but-hidden scenarios: internal staff-only pages, login-required content, internal search result pages, faceted-navigation parameter URLs you don't want competing with the canonical. Use noindex meta tag or X-Robots-Tag — the page stays accessible to logged-in users, just invisible in public search.
Urgent legal/privacy scenarios: leaked customer data, accidentally-public draft, defamation requiring removal. Use GSC Removal Tool first for immediate suppression, then permanent fix on the page itself, then verify with site: search at 24h, 1 week, and 1 month.
Batch Deindexing for Large Sites
When you need to deindex thousands of pages — say, after a site migration leaves an old section orphaned, or when cleaning up a legacy CMS — manual one-at-a-time methods don't scale. The strategy:
Step 1: Categorize the URLs. Pull the full list of URLs to deindex from your sitemap, GSC Pages report (Indexed pages), or a Screaming Frog crawl. Bucket them by deindexing intent: keep but hide (noindex), permanently delete (410), redirect to new location (301).
Step 2: Bulk-deploy the chosen method. For noindex on entire URL patterns (e.g., all /tag/ URLs), use server-level X-Robots-Tag rules or a CMS template change rather than editing pages one by one. For 410 status on deleted URLs, configure your server or framework to return 410 for the URL pattern (a sketch follows these steps).
Step 3: Submit a sitemap of URLs to deindex. Counterintuitively, including URLs in a sitemap (even with noindex tags) speeds deindexing because it tells Google to recrawl them. After deindexing is complete, remove these URLs from the sitemap.
Step 4: Monitor progress in GSC. Watch the Indexed page count in Indexing → Pages decline week over week. If counts plateau, sample 10 URLs and inspect them — the noindex may not be deploying correctly, or robots.txt may be blocking the recrawl.
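As a sketch of the framework-level piece of Step 2, assuming an Express app and a hypothetical /old-section/ prefix:

// gone.ts (hypothetical): return 410 for every URL under a deleted section
import express from "express";

const app = express();

// 410 signals permanent removal, which Google deindexes faster than a 404
app.use("/old-section", (_req, res) => {
  res.status(410).send("Gone");
});

app.listen(3000);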
How to Verify a Page Is Deindexed
Three verification methods, ordered by reliability:
GSC URL Inspection (most reliable). Paste the URL into the inspection tool at the top of GSC. The result shows "URL is not on Google" with the reason — typically "Excluded by ‘noindex’ tag" or "Not found (404)." This is the authoritative source, and the same check can be scripted for batches of URLs via the URL Inspection API (see the sketch after this list).
site: search operator. Run site:example.com/old-page in Google. If the URL is fully deindexed, no result will appear. If a result appears with the URL but no description, that's a partial state — the URL is in the index but Google hasn't crawled the noindex tag yet, or robots.txt is blocking the crawl.
GSC Pages report. Indexing → Pages → Not indexed shows the categorized reasons. URLs successfully deindexed via noindex appear under "Excluded by ‘noindex’ tag." URLs removed via 410 or 404 appear under "Not found (404)," and redirected URLs appear under "Page with redirect."
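For the scripted version of the URL Inspection check, a minimal sketch using the Node googleapis client; it assumes a service account (or other OAuth credentials) that already has access to the property, and the file names and URLs are hypothetical:

// inspect.ts (hypothetical): query indexing status via the URL Inspection API
import { google } from "googleapis";

async function main() {
  const auth = new google.auth.GoogleAuth({
    keyFile: "service-account.json", // credentials with access to the GSC property
    scopes: ["https://www.googleapis.com/auth/webmasters.readonly"],
  });
  const searchconsole = google.searchconsole({ version: "v1", auth });

  const res = await searchconsole.urlInspection.index.inspect({
    requestBody: {
      inspectionUrl: "https://example.com/old-page",
      siteUrl: "https://example.com/",
    },
  });

  // coverageState describes why the URL is or isn't indexed,
  // e.g. "Excluded by 'noindex' tag"
  const status = res.data.inspectionResult?.indexStatusResult;
  console.log(status?.coverageState, status?.verdict);
}

main().catch(console.error);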
Recovering From Accidentally Deindexing Your Whole Site
This happens more than you'd expect — usually a developer pushes a staging config to production with <meta name="robots" content="noindex"> in the global template, or a WordPress "Discourage search engines from indexing this site" checkbox gets toggled. Within 1–3 weeks, organic traffic collapses.
Step 1: Find and remove the noindex source. View the source of the homepage and 2–3 representative pages and look for any <meta name="robots"> tag containing noindex. Also check HTTP response headers (curl -I https://yoursite.com) for X-Robots-Tag; a small script for checking many URLs at once is sketched at the end of this section. In WordPress, go to Settings → Reading and uncheck "Discourage search engines."
Step 2: Trigger immediate recrawl on critical URLs. Use GSC URL Inspection → Request Indexing on your homepage, top 10 traffic pages, and primary category pages. This puts them in the priority recrawl queue.
Step 3: Submit/resubmit the sitemap. In GSC → Sitemaps, resubmit your sitemap.xml. Submission triggers Google to revisit listed URLs.
Step 4: Monitor recovery. Indexed page counts typically begin recovering within 7–14 days, with full recovery taking 4–8 weeks for established sites. Tools like SitemapFixer can audit your sitemap in one pass to confirm all listed URLs return 200 and lack noindex directives.
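To handle the spot-checking from Step 1 across many URLs at once, a small script can flag noindex in either the response header or the markup. A minimal sketch with a hypothetical URL list (Node 18+ for the built-in fetch):

// audit-noindex.ts (hypothetical): flag URLs that still carry a noindex signal
const urls = [
  "https://example.com/",
  "https://example.com/pricing",
  "https://example.com/blog/",
];

async function auditNoindex(url: string): Promise<void> {
  const res = await fetch(url);
  const header = res.headers.get("x-robots-tag") ?? "";
  const html = await res.text();
  // Naive check: a robots/googlebot meta tag whose content includes "noindex"
  const metaNoindex = /<meta[^>]+name=["'](robots|googlebot)["'][^>]*noindex/i.test(html);

  if (header.toLowerCase().includes("noindex") || metaNoindex) {
    console.log(`NOINDEX: ${url} (header: "${header}", meta tag: ${metaNoindex})`);
  } else {
    console.log(`ok: ${url}`);
  }
}

async function main() {
  for (const url of urls) {
    await auditNoindex(url);
  }
}

main().catch(console.error);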
Deindexing for Legal and Privacy Reasons
Legal removal requests (GDPR, defamation, copyright, doxxing) follow a different track from standard SEO deindexing because speed matters and you may not control the offending page.
If you control the page: use the GSC Removal Tool for immediate suppression, then permanently delete the content and return 410. Document the timeline and content removal for any legal record.
If the page is on a third-party site: request removal from the site owner. If they refuse and the content violates Google's policies (personally identifiable information, doxxing, non-consensual imagery, etc.), submit a request through Google's personal information removal form. Google may remove the URL from search results without removing the page itself.
For GDPR "right to be forgotten" requests in the EU: Google provides a separate GDPR removal form. Approval is discretionary and Google weighs public interest against privacy. Approved removals affect EU search results only — the URL remains visible in non-EU Google.
For step-by-step procedures using the GSC tool, see how to remove URLs from GSC.
Related Guides
- Noindex Directives: Complete Guide to Meta and Header Tags
- How to Remove a URL From Google Search Console
- X-Robots-Tag: HTTP Header Directives Explained
- robots.txt vs noindex: Why They Are Not the Same
- Canonical and Noindex: Can You Use Them Together?
- htaccess noindex: Block Indexing via X-Robots-Tag in Apache