ClaudeBot: Anthropic's Three-Bot Crawling Framework
Anthropic operates three distinct web crawlers, each with a different purpose and its own robots.txt user agent token. Most site owners know only about ClaudeBot, but blocking it while leaving the other two bots unaddressed gives you incomplete control over Anthropic's access to your content.
This guide explains all three bots, what they do, how to identify them in your logs, and how to configure robots.txt for each.
Anthropic's Three Crawlers
| Bot Name | User Agent Token | Purpose |
|---|---|---|
| ClaudeBot | ClaudeBot | Training data collection for Claude AI models |
| Claude-User | Claude-User | Real-time web fetching during Claude.ai conversations |
| Claude-SearchBot | Claude-SearchBot | Web search index for Claude search features |
ClaudeBot: Training Data Crawler
ClaudeBot is Anthropic's primary training data crawler. Its full user agent string is: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)`
ClaudeBot crawls publicly available web pages to collect data used to train and improve future Claude models. Anthropic states that ClaudeBot respects robots.txt and will not crawl pages disallowed by a `Disallow: /` directive under `User-agent: ClaudeBot`.
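Expressed with standard robots.txt syntax and the token above, that opt-out is a two-line group:

```
User-agent: ClaudeBot
Disallow: /
```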
ClaudeBot is the Anthropic bot most similar to GPTBot in behavior and purpose. In server logs it typically appears crawling at moderate rates, often during off-peak hours. If you want to prevent Anthropic from using your content for training, blocking ClaudeBot is the primary action to take.
Claude-User: Live Browsing Crawler
Claude-User activates when a Claude.ai user asks Claude to fetch a URL or browse the web during a conversation. Unlike ClaudeBot, which crawls proactively for training, Claude-User only accesses pages on-demand when a user explicitly requests it.
The practical implication: if you block Claude-User, Claude.ai will not be able to summarize or analyze your pages when users link to them in conversations. This may reduce your site's visibility and utility in the Claude ecosystem. However, if your content is sensitive or paywalled, blocking Claude-User prevents it from being surfaced in AI-generated summaries without authorization.
Claude-User respects robots.txt. A Disallow directive for Claude-User will be honored when Claude tries to fetch a page on a user's behalf.
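A minimal example, using the Claude-User token from the table above, that opts your whole site out of on-demand fetching:

```
User-agent: Claude-User
Disallow: /
```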
Claude-SearchBot: Search Index Crawler
Claude-SearchBot builds and maintains Anthropic's web search index, which powers Claude's search capabilities. It is Anthropic's equivalent of OpenAI's OAI-SearchBot. When Claude's web search surfaces current information, Claude-SearchBot is the crawler that fetched and indexed the pages behind those results.
Blocking Claude-SearchBot means your content may not appear in Claude's search results when users ask Claude to search the web. If you want visibility in Claude search, allow Claude-SearchBot while potentially blocking ClaudeBot for training.
How to Block Anthropic Crawlers in robots.txt
You can block each bot individually or all three together:
Block all Anthropic crawlers:
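```
# Standard robots.txt syntax; user agent tokens as listed in the table above
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-User
Disallow: /

User-agent: Claude-SearchBot
Disallow: /
```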
Block training only, allow live search and browsing:
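```
# ClaudeBot (training) is blocked; the explicit Allow groups make the open
# policy for the other two bots unambiguous (omitting them also allows access)
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /
```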
Granular path-level control:
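The directory names below are placeholders for illustration; substitute the paths you actually want to restrict for each use case.

```
# Hypothetical paths - adjust to your own site structure
User-agent: ClaudeBot
Disallow: /premium/
Disallow: /drafts/

User-agent: Claude-User
Disallow: /members/

User-agent: Claude-SearchBot
Disallow: /drafts/
```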
Identifying Anthropic Bots in Server Logs
All three bots are identifiable in your access logs. Look for these patterns:
- ClaudeBot: Contains `ClaudeBot/1.0` and `claudebot@anthropic.com`
- Claude-User: Contains `Claude-User` in the user agent string
- Claude-SearchBot: Contains `Claude-SearchBot` in the user agent string
Anthropic publishes its crawler IP ranges for verification purposes. You can cross-check log entries against those published ranges to confirm you are seeing genuine Anthropic bots rather than spoofed traffic. Spoofed bot traffic using AI company user agents is a real phenomenon, so IP verification is the only reliable confirmation method.
ClaudeBot vs. GPTBot: Key Differences
| Factor | ClaudeBot | GPTBot |
|---|---|---|
| Operator | Anthropic | OpenAI |
| Companion bots | Claude-User, Claude-SearchBot | ChatGPT-User, OAI-SearchBot |
| robots.txt compliance | Yes | Yes |
| IP verification available | Yes | Yes |
| Crawl rate | Moderate, respects crawl-delay | Moderate, respects crawl-delay |
The main structural difference is that Anthropic has a cleaner separation of its three bots — each with a distinct user agent token — while OpenAI's bots have evolved over time and the naming is less consistent. This makes Anthropic's crawlers easier to manage in robots.txt because you can target each use case individually with precision.