Updated April 2026

Google Crawlers: Complete List of User Agents

Google operates multiple distinct web crawlers, each with its own user agent token and specific purpose. Knowing which crawler does what matters when you configure robots.txt, diagnose crawl issues in server logs, or verify Googlebot activity with IP address lookups.

This guide covers every active Google crawler, its user agent string, what it crawls, and how to target or block it in robots.txt.

Complete Google Crawler Reference

Crawler Name	User Agent Token	Purpose
Googlebot	Googlebot	Main web crawl for Google Search index
Googlebot-Image	Googlebot-Image	Crawls images for Google Images
Googlebot-News	Googlebot-News	Crawls articles for Google News
Googlebot-Video	Googlebot-Video	Crawls videos for Google Video Search
Google-Extended	Google-Extended	AI training data and Gemini products
AdsBot-Google	AdsBot-Google	Crawls landing pages for Google Ads quality
AdsBot-Google-Mobile	AdsBot-Google-Mobile	Mobile version of AdsBot for landing page quality
APIs-Google	APIs-Google	Delivers push notification messages for Google APIs
Mediapartners-Google	Mediapartners-Google	AdSense crawls for ad targeting
Google-InspectionTool	Google-InspectionTool	URL Inspection tool in Search Console
Googlebot Smartphone	Googlebot	Mobile-first indexing crawl (part of main Googlebot)

Googlebot (Main Web Crawler)

Googlebot is Google's primary crawler for building and maintaining the web search index. It comes in two variants:

Googlebot Desktop: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot Smartphone: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) ... (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Google made mobile-first indexing the default for new sites in 2019, so Googlebot now crawls primarily with the smartphone user agent. The desktop user agent is still used, but less frequently. When configuring robots.txt, a rule for User-agent: Googlebot applies to both variants.
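When reading server logs, the two variants can be told apart from the user agent string. The following is a minimal Python sketch, not Google's own logic: the function name classify_googlebot is hypothetical, and it assumes that the presence of "Android" is enough to mark the smartphone variant. A matching string is only a claim until the IP address has been verified.

```python
def classify_googlebot(user_agent: str) -> str:
    """Label a logged user agent string as a Googlebot variant.

    Returns "smartphone", "desktop", or "not-googlebot". The string can be
    spoofed, so this is a label for log analysis, not proof of origin.
    """
    if "Googlebot/" not in user_agent:
        return "not-googlebot"
    # Assumption: only the smartphone variant advertises an Android device.
    if "Android" in user_agent:
        return "smartphone"
    return "desktop"


desktop = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(classify_googlebot(desktop))  # desktop
```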

Googlebot reads and respects your XML sitemap. Submit your sitemap in Google Search Console to ensure Googlebot discovers it, and reference it in robots.txt via Sitemap: https://yoursite.com/sitemap.xml for automatic discovery.
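To confirm that a robots.txt file actually exposes a Sitemap line, the file can be parsed with Python's standard urllib.robotparser module. This is a small sketch: the helper name sitemaps_in and the sample file contents are illustrative assumptions, and Python's parser is not the parser Google uses.

```python
from urllib.robotparser import RobotFileParser

# Sample file contents for illustration only.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
"""


def sitemaps_in(robots_txt: str) -> list[str]:
    """Return every Sitemap URL declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present (Python 3.8+).
    return parser.site_maps() or []


print(sitemaps_in(ROBOTS_TXT))  # ['https://yoursite.com/sitemap.xml']
```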

Google-Extended (AI Training Crawler)

Google-Extended is a robots.txt control token rather than a separate crawler. It has no HTTP user agent string of its own, and the crawling itself is done with Google's existing user agents, so it will not appear under that name in server logs. Google introduced it in 2023 to give site owners a way to opt out of AI training independently of the main search index. Blocking Google-Extended does not affect your Google Search rankings; it only affects whether your content is used to improve Gemini (formerly Bard).

# Block AI training, keep search indexing
User-agent: Google-Extended
Disallow: /
# Keep Googlebot allowed
User-agent: Googlebot
Allow: /
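A rule set like this can be sanity-checked before deployment. The sketch below uses Python's standard urllib.robotparser; the helper name allowed and the test URL are assumptions for illustration. Python's parser matches tokens differently from Google's in some edge cases, so treat the result as a rough check rather than a guarantee of how Google will read the file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Report whether `token` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


for token in ("Google-Extended", "Googlebot"):
    print(token, allowed(ROBOTS_TXT, token, "https://yoursite.com/page"))
```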

AdsBot-Google

AdsBot-Google crawls your Google Ads landing pages to evaluate ad quality. It ignores rules in the wildcard User-agent: * group, so a global Disallow does not block it. Many site owners assume a global Disallow applies to all bots, but to restrict AdsBot you must name it in its own group.

# Must be explicit — AdsBot ignores User-agent: * rules
User-agent: AdsBot-Google
Disallow: /private/
User-agent: AdsBot-Google-Mobile
Disallow: /private/
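Because a wildcard group does not cover AdsBot, it can be useful to check whether a robots.txt file names the AdsBot tokens at all. The following is a minimal sketch of such a check; the function names are hypothetical, and the parsing is deliberately simple (it only collects User-agent lines).

```python
ADSBOT_TOKENS = ("AdsBot-Google", "AdsBot-Google-Mobile")


def named_user_agents(robots_txt: str) -> set[str]:
    """Collect every User-agent value in a robots.txt body, lowercased."""
    agents = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def missing_adsbot_groups(robots_txt: str) -> list[str]:
    """Return the AdsBot tokens that no group names explicitly."""
    named = named_user_agents(robots_txt)
    return [t for t in ADSBOT_TOKENS if t.lower() not in named]


wildcard_only = "User-agent: *\nDisallow: /private/\n"
print(missing_adsbot_groups(wildcard_only))
# ['AdsBot-Google', 'AdsBot-Google-Mobile']
```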

Googlebot-Image, Googlebot-News, Googlebot-Video

These specialist crawlers index content for specific Google products. If you want your images to appear in Google Images but not be indexed for web search (unusual, but possible), you can Allow Googlebot-Image while Disallowing Googlebot on specific paths. Similarly, sites with a News Publisher Center account can control Googlebot-News separately from the main crawler.

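Googlebot-Image, Googlebot-News, and Googlebot-Video each have their own token, so a group that names one of them controls that crawler independently of Googlebot. Google's crawler documentation also lists Googlebot as a secondary token for these crawlers: when no group names the specialist crawler, it follows the Googlebot group. To treat a specialist crawler differently from Googlebot, give it its own group.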

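# Block all bots without their own group from /private/
User-agent: *
Disallow: /private/
# Googlebot-Image has its own group, so it ignores the * group: repeat /private/ here
User-agent: Googlebot-Image
Disallow: /private/
# Block image crawling on product thumbnails only
Disallow: /thumbnails/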
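robots.txt groups are not cumulative: a crawler follows only the most specific group that names it, and falls back to the * group only when no group does. The sketch below illustrates that selection rule with a simplified parser. The function names and sample rules are assumptions for illustration, and fallback between related Google tokens is not modeled.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercased user agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []   # tokens of the group being read
    in_rules = False          # True once the group has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep:
            continue
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:      # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and current:
            in_rules = True
            for token in current:
                groups[token].append((field, value))
    return groups


def rules_for(robots_txt: str, token: str) -> list[tuple[str, str]]:
    """Return the rules of the group naming `token`, else the * group."""
    groups = parse_groups(robots_txt)
    return groups.get(token.lower(), groups.get("*", []))


SAMPLE = "User-agent: *\nDisallow: /private/\nUser-agent: Googlebot-Image\nDisallow: /thumbnails/\n"
print(rules_for(SAMPLE, "Googlebot-Image"))  # [('disallow', '/thumbnails/')]
```

In this sample, the /private/ rule does not reach Googlebot-Image, because that crawler has its own group. A path that must be blocked for everyone has to be repeated in every named group.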

Verifying Googlebot: IP Address Lookup

Because any bot can spoof a user agent string, a Googlebot user agent on its own proves nothing. The reliable check is a reverse DNS lookup on the IP address that resolves to a hostname under google.com or googlebot.com, confirmed by a forward lookup back to the same IP. Google publishes this verification process in its documentation.

Steps to verify Googlebot:

1. Find the IP address of the suspicious request in your server access logs.
2. Run a reverse DNS lookup: host [IP address]. The result should show a hostname ending in googlebot.com or google.com.
3. Run a forward DNS lookup on that hostname to confirm it resolves back to the original IP.
4. If both lookups agree on a google.com or googlebot.com hostname, the request is genuine Googlebot.
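The steps above can be sketched in Python with the standard socket module. This is an illustrative sketch under stated assumptions, not an official implementation: the function name verify_googlebot is hypothetical, the resolver arguments exist only so the logic can be exercised without network access, and socket.gethostbyname_ex handles IPv4 only.

```python
import socket

# Leading dots prevent lookalike domains such as "notgoogle.com" from matching.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def verify_googlebot(ip, reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex):
    """Return True if `ip` passes the reverse-then-forward DNS check."""
    try:
        hostname = reverse(ip)[0]            # reverse lookup: IP -> hostname
        if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        addresses = forward(hostname)[2]     # forward lookup: hostname -> IPs
    except OSError:                          # failed lookup: treat as unverified
        return False
    return ip in addresses                   # the hostname must map back to the IP
```

Calling verify_googlebot with an IP address taken from your access logs performs two live DNS queries, so results are worth caching if you check many requests.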

Google also publishes its crawler IP ranges in a JSON file at https://developers.google.com/search/apis/ipranges/googlebot.json. You can use this for automated IP-based verification in your CDN or firewall rules.
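For automated checks, the published ranges can be matched with Python's standard ipaddress module. The sketch below assumes the JSON document has a top-level "prefixes" list whose entries carry an "ipv4Prefix" or "ipv6Prefix" key; confirm this against the live file before relying on it. The function names are hypothetical, and the sample document uses reserved documentation ranges rather than real Google addresses.

```python
import ipaddress
import json
from urllib.request import urlopen

RANGES_URL = "https://developers.google.com/search/apis/ipranges/googlebot.json"


def fetch_ranges(url: str = RANGES_URL) -> dict:
    """Download and parse the published ranges file (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def load_networks(ranges_doc: dict) -> list:
    """Turn the parsed JSON document into ip_network objects."""
    networks = []
    for entry in ranges_doc.get("prefixes", []):
        # Assumed layout: each entry has an "ipv4Prefix" or "ipv6Prefix" key.
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_ranges(ip: str, networks: list) -> bool:
    """Check whether `ip` falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    # A membership test across IP versions simply returns False.
    return any(address in net for net in networks)


# Sample data using documentation-only ranges, not real Google prefixes.
sample_doc = {"prefixes": [{"ipv4Prefix": "192.0.2.0/24"},
                           {"ipv6Prefix": "2001:db8::/32"}]}
sample_networks = load_networks(sample_doc)
print(ip_in_ranges("192.0.2.7", sample_networks))  # True
```

In production you would call load_networks(fetch_ranges()) on a schedule and reuse the result, since the published ranges change over time.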

robots.txt for Google Crawlers: Common Patterns

# Allow all Googlebot crawlers (recommended for most sites)
User-agent: Googlebot
Allow: /
# Allow images, news, video indexing
User-agent: Googlebot-Image
Allow: /
User-agent: Googlebot-News
Allow: /
# Optionally block AI training without affecting search
User-agent: Google-Extended
Disallow: /
# Always add your sitemap for automatic discovery
Sitemap: https://yoursite.com/sitemap.xml

Google Crawlers vs. Third-Party AI Crawlers

Google's crawlers are distinct from the AI crawlers operated by OpenAI (GPTBot), Anthropic (ClaudeBot), and Perplexity (PerplexityBot). Each requires its own robots.txt rules. A Disallow for Googlebot does not block GPTBot, and vice versa. See the individual guides below for each third-party AI crawler.
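The independence of these tokens can be demonstrated with Python's standard urllib.robotparser. This is a rough sketch: the helper name blocked_tokens and the sample rules are assumptions for illustration, and Python's parser only approximates how each vendor's crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

CRAWLERS = ["Googlebot", "GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_tokens(robots_txt: str, tokens: list[str], url: str) -> list[str]:
    """Return the tokens that the given rules forbid from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [t for t in tokens if not parser.can_fetch(t, url)]


GOOGLE_ONLY = "User-agent: Googlebot\nDisallow: /\n"
print(blocked_tokens(GOOGLE_ONLY, CRAWLERS, "https://yoursite.com/"))
# ['Googlebot']
```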
