By SitemapFixer Team
Updated April 2026

Googlebot News: Requirements and Optimization for Google News


What Is Googlebot News?

Googlebot News is Google's specialized crawler for news content. It crawls articles from eligible publishers for inclusion in Google News — the news aggregation product available at news.google.com and the News tab in Google Search. Googlebot News is separate from the main Googlebot crawler and operates on a much faster crawling schedule to index breaking news articles within minutes of publication.

Appearing in Google News is not guaranteed. Unlike standard organic search, where any publicly crawlable page can be indexed, Google News applies an additional eligibility layer: your site must meet Google's news content policies and technical requirements, and Google's automated systems determine which publications qualify.

Google News is a high-traffic surface for news publishers. Articles that rank in the Top Stories carousel in Google Search or in the Google News feed can receive orders of magnitude more traffic than the same article would get through organic search alone. Understanding how Googlebot News works is essential for any news publisher trying to capture that traffic.

Googlebot News User Agent String

The Googlebot News user agent is distinct and simple:

Googlebot-News

This is the token you use in robots.txt to control Googlebot News's access to your content. Note that Googlebot-News is primarily a robots.txt token: according to Google's crawler documentation, the actual HTTP requests from Googlebot News use the standard Googlebot user agent strings, so you will not see "Googlebot-News" as a browser-style user agent in your server logs.

One important nuance: robots.txt groups are matched by the most specific user agent token. If your file contains a User-agent: Googlebot-News group, Googlebot News follows that group exclusively. If it does not, Googlebot News falls back to a User-agent: Googlebot group, and only if neither exists does it follow the wildcard User-agent: * group. In practice this means a Disallow aimed at the generic Googlebot also blocks Googlebot News unless you add a more specific Googlebot-News group that allows the content.
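As a rough sketch, the group selection can be modeled as a precedence chain over user agent tokens. This is a simplified illustration, not a full RFC 9309-compliant parser, and the helper names are made up for this example:

```python
# Sketch of robots.txt group selection for Googlebot-News, following the
# most-specific-token precedence Google documents:
#   Googlebot-News > Googlebot > *
# Simplified for illustration; not a full RFC 9309 parser.

def parse_groups(robots_txt):
    """Map each user-agent token (lowercased) to its list of rule lines."""
    groups, current, reading_agents = {}, [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if not reading_agents:
                current = []                      # a new group starts
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
            reading_agents = True
        else:
            for token in current:
                groups[token].append(f"{field.lower()}: {value}")
            reading_agents = False
    return groups

def group_for_news(groups):
    """Return the (token, rules) pair Googlebot-News would follow."""
    for token in ("googlebot-news", "googlebot", "*"):
        if token in groups:
            return token, groups[token]
    return None, []

robots = """
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""
# No Googlebot-News group exists here, so Googlebot-News falls back to
# the Googlebot group and inherits its Allow.
print(group_for_news(parse_groups(robots)))  # ('googlebot', ['allow: /'])
```

Running this against your own robots.txt is a quick way to confirm which group Googlebot News will actually follow before you rely on a section-level Allow or Disallow.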

Google News Publisher Requirements

Google News has both content policy requirements and technical requirements. On the content side, Google requires that publishers produce original reporting, have clear authorship and contact information, maintain editorial transparency (disclosing ownership and funding), and publish content that complies with Google's general content policies (no hate speech, no dangerous content, no spam).

Technical requirements include:

  • Each article must have a unique, permanent URL that does not change after publication
  • Article pages must be crawlable by Googlebot News (not blocked in robots.txt)
  • Pages must have a clear HTML <title> that matches the article headline
  • Article dates and author bylines must be machine-readable (structured data helps but is not strictly required)
  • The site must be accessible without login or paywall for the crawled content (paywalled content can appear in Google News with specific markup)
  • A news sitemap is strongly recommended and significantly accelerates article indexing
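For the machine-readable dates and bylines mentioned above, schema.org NewsArticle markup is the usual vehicle. A minimal sketch that emits JSON-LD with an ISO 8601 publication date — all names, URLs, and values here are placeholders:

```python
# Sketch: minimal schema.org NewsArticle JSON-LD so article dates and
# bylines are machine-readable. All names and values are placeholders.
import json
from datetime import datetime, timezone

article = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Breaking: Major Story Develops in City Center",
    # isoformat() on an aware datetime yields a W3C/ISO 8601 date
    # with a timezone offset, e.g. 2026-04-27T14:30:00+00:00
    "datePublished": datetime(2026, 4, 27, 14, 30,
                              tzinfo=timezone.utc).isoformat(),
    "author": [{"@type": "Person", "name": "Jane Reporter"}],
    "publisher": {"@type": "Organization", "name": "Example News"},
}

# Embed the JSON-LD in the article page's <head>
json_ld = json.dumps(article, indent=2)
snippet = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(snippet)
```

Structured data is not strictly required, but it removes any ambiguity about which date and byline belong to the article.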

Google retired its formal publisher application process in late 2019. Sites that meet the content and technical requirements are eligible to appear in Google News automatically, and Google's algorithms determine which sites to include. You can monitor how your articles perform through the Google News performance report in Google Search Console.

News Sitemaps: The news:news Extension

A news sitemap is an XML sitemap that uses the Google News extension namespace to provide article-specific metadata. It is the primary mechanism for telling Googlebot News about newly published articles quickly — often resulting in indexing within minutes of publication.

A news sitemap should only include articles published within the last 48 hours. Google ignores older articles in news sitemaps (they should be in your regular sitemap instead). The format:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/news/2026/04/27/breaking-story</loc>
    <news:news>
      <news:publication>
        <news:name>Example News</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2026-04-27T14:30:00+00:00</news:publication_date>
      <news:title>Breaking: Major Story Develops in City Center</news:title>
    </news:news>
  </url>
</urlset>

Key points about news sitemaps: the news:title must match the headline as it appears on the article page. The news:publication_date should use W3C Datetime format, including the timezone offset. Include at most 1,000 URLs in a news sitemap (covering the last 48 hours); if you publish more, split them across multiple news sitemaps. Submit the news sitemap in Google Search Console. Note that the old ping endpoint (https://www.google.com/ping?sitemap=YOUR_SITEMAP_URL) was deprecated in 2023 and no longer works — instead, keep the sitemap itself fresh and let Googlebot News recrawl it, which it does frequently for news sitemaps.
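The 48-hour window and 1,000-URL cap are easy to enforce at generation time. A sketch using only the Python standard library — the article dictionary fields and the build_news_sitemap helper are illustrative, not any standard API:

```python
# Sketch: build a Google News sitemap from recent articles, enforcing
# the documented limits (last 48 hours, at most 1,000 URLs).
# The article dict fields and helper name are illustrative.
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"

def build_news_sitemap(articles, now=None):
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=48)
    # Only recent articles, newest first, capped at 1,000 URLs
    recent = sorted(
        (a for a in articles if a["published"] >= cutoff),
        key=lambda a: a["published"], reverse=True,
    )[:1000]

    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("news", NEWS_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for a in recent:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = a["url"]
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        pub = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(pub, f"{{{NEWS_NS}}}name").text = a["publication"]
        ET.SubElement(pub, f"{{{NEWS_NS}}}language").text = a["language"]
        # W3C Datetime with offset, e.g. 2026-04-27T14:30:00+00:00
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = (
            a["published"].isoformat()
        )
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = a["title"]
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(
        urlset, encoding="unicode"
    )
```

Regenerating this file on every publish (and listing it in Search Console once) is typically enough; stale articles simply age out of the 48-hour window on the next rebuild.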

robots.txt and Googlebot News

You can control Googlebot News access independently from the main Googlebot using the Googlebot-News token:

# Googlebot News: crawl news and article sections, but keep
# opinion and sponsored content out of Google News.
# Rules for the same user agent belong in a single group —
# some robots.txt parsers only honor the first group they see.
User-agent: Googlebot-News
Allow: /news/
Allow: /articles/
Disallow: /opinion/
Disallow: /sponsored/

# Main Googlebot still crawls everything
User-agent: Googlebot
Allow: /

Selectively blocking sections from Googlebot News is a useful editorial tool. If your site publishes both hard news and opinion or branded content, you may want to prevent the latter from appearing in Google News, which has stricter content expectations around journalistic integrity and labeling.

How Googlebot News Differs from Googlebot

Googlebot News has several key differences from the main Googlebot that publishers need to understand:

Dimension          | Googlebot                                                      | Googlebot News
Crawl frequency    | Based on crawl budget (days to weeks for lower-priority pages) | Near real-time (minutes to hours for recognized news publishers)
Content focus      | All page types                                                 | News articles only (recent content)
Destination        | Google Search index                                            | Google News index + Top Stories
Eligibility        | Any publicly crawlable page                                    | Algorithmic, based on Google's news content policies
Sitemap format     | Standard XML sitemap                                           | News sitemap with the news: namespace

Common Reasons Sites Do Not Qualify for Google News

Based on Google's published policies and common publisher experiences, these are the most frequent reasons sites do not qualify for Google News:

  • Lack of original reporting: Sites that primarily aggregate or republish content from other outlets without adding original reporting or analysis are not eligible. Google News requires original journalism.
  • No clear authorship or contact information: Articles must have bylines. The publication must have a visible "About" page, ownership disclosure, and contact information. Anonymous publishing is not acceptable.
  • Undisclosed sponsored content: If sponsored or advertiser-funded articles are not clearly labeled as such, this violates Google News content policies. Advertorials must be marked as advertising.
  • Technical issues preventing crawling: If Googlebot News cannot crawl your articles — due to robots.txt blocks, login requirements, or crawl errors — the site cannot be evaluated for inclusion.
  • Content policy violations: Hate speech, medical misinformation, election misinformation, violent content, or content that violates Google's general content policies will result in rejection or removal from Google News.
  • Thin or low-quality content: Articles that are very short, lack substantive reporting, or appear to be generated for SEO rather than informational purposes will not qualify.