By SitemapFixer Team
Updated April 2026

Robots.txt Examples and Templates


Copy the template that matches your platform and customize it as needed. Every robots.txt should declare your sitemap URL (conventionally at the bottom) and be retested in Google Search Console after any change.

Minimal - allow everything, declare sitemap

User-agent: *
Allow: /
 
Sitemap: https://yoursite.com/sitemap.xml

The simplest correct robots.txt. Allows all crawlers and declares your sitemap location. Use this if you have no pages to block.
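Before deploying any of these templates, you can sanity-check the rules locally with Python's standard-library urllib.robotparser. A minimal sketch, assuming yoursite.com as a placeholder domain:

```python
from urllib.robotparser import RobotFileParser

# The minimal template, parsed from a string (no HTTP fetch needed).
TEMPLATE = """\
User-agent: *
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(TEMPLATE.splitlines())

# Every crawler may fetch every URL under this template.
print(rp.can_fetch("Googlebot", "https://yoursite.com/any-page"))  # True
print(rp.site_maps())  # ['https://yoursite.com/sitemap.xml']
```

Note that the standard-library parser does not support Googlebot-style wildcards (`*` inside paths), so use it for prefix rules and verify wildcard rules in Google Search Console instead.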

WordPress

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-login.php
Disallow: /?s=
Allow: /wp-admin/admin-ajax.php
 
Sitemap: https://yoursite.com/sitemap_index.xml

Blocks the WordPress admin, includes directory, and login page, plus internal search result URLs (/?s=). Allows admin-ajax.php, which dynamic frontend features rely on. Declares the sitemap index location used by Yoast SEO.

Shopify

User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkouts
Disallow: /account
Disallow: /*?*sort_by
Disallow: /*?*view
 
Sitemap: https://yourstore.myshopify.com/sitemap.xml

Blocks Shopify admin, cart, checkout, order, and account pages, plus the sort_by and view parameters that create duplicate URL variants. Shopify generates a robots.txt automatically; customizing it via the robots.txt.liquid template reduces crawl waste on parameter URLs.

Next.js / Vercel

User-agent: *
Allow: /
Disallow: /api/
Disallow: /_next/
 
Sitemap: https://yoursite.com/sitemap.xml

Blocks API routes and Next.js internal routes from being crawled while leaving all public pages accessible. Adjust the /api/ rule if you have public API documentation pages you want indexed. If Search Console reports rendering problems, consider allowing /_next/static/, since your JS and CSS bundles are served from there and Google needs them to render pages.

Block AI scrapers while keeping Google

User-agent: GPTBot
Disallow: /
 
User-agent: ChatGPT-User
Disallow: /
 
User-agent: CCBot
Disallow: /
 
User-agent: anthropic-ai
Disallow: /
 
User-agent: *
Allow: /
 
Sitemap: https://yoursite.com/sitemap.xml

Blocks known AI training crawlers while keeping Google, Bing, and other search engines. Crawlers follow the most specific User-agent group that matches them, so the named-bot groups take precedence over the wildcard group regardless of order. Note: compliance with robots.txt is voluntary, so these bots are not required to respect it.
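You can confirm the per-bot behavior locally with urllib.robotparser. A sketch using a trimmed two-group version of the template above (yoursite.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# One named-bot group and the wildcard fallback.
RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(RULES.splitlines())

# GPTBot matches its named group; search crawlers fall through to *.
print(rp.can_fetch("GPTBot", "https://yoursite.com/article"))     # False
print(rp.can_fetch("Googlebot", "https://yoursite.com/article"))  # True
```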

The Most Dangerous Robots.txt Rule

User-agent: *
Disallow: /

This blocks every crawler, Google included, from your entire site. It is commonly left in robots.txt after development on a staging environment and results in zero organic traffic. Check your live file at yoursite.com/robots.txt if you suspect it is present.
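The effect is easy to demonstrate by parsing the rule locally (yoursite.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# The blanket block, parsed locally to show its effect.
rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /
""".splitlines())

# No crawler may fetch anything, including the homepage.
print(rp.can_fetch("Googlebot", "https://yoursite.com/"))    # False
print(rp.can_fetch("Bingbot", "https://yoursite.com/blog"))  # False
```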

