By SitemapFixer Team
Updated April 2026

AJAX Crawling: How Google Crawls JavaScript and SPA Content

If your site loads content with AJAX, fetch(), XHR, or any client-side JavaScript, the question of how Google crawls it is rarely simple. The answer changed three times in the past decade — from the original AJAX crawling scheme, to its deprecation, to today's evergreen Chromium-based rendering — and most advice on the web reflects an older version of the truth. This guide walks through how Googlebot actually handles AJAX-loaded content in 2026, where rendering breaks, and what to do for SPAs, infinite scroll, and JS-injected metadata.

A Brief History: Why AJAX Crawling Used to Be a Special Thing

In 2009, Google introduced the AJAX crawling scheme. Sites with hash-bang URLs like example.com/#!product/123 could provide a parallel ?_escaped_fragment_=product/123 URL that returned a static HTML snapshot of the AJAX-rendered page. Googlebot would fetch the snapshot, index it, and display the original hash-bang URL in search results. This was the canonical way to make AJAX content indexable for nearly six years.

Google deprecated the scheme in October 2015 once Googlebot became capable of executing JavaScript directly, and fully retired support in the second quarter of 2018. If you still see _escaped_fragment_ infrastructure in your codebase — proxy rules, snapshot generators, hash-bang routing — it is dead code. Remove it, migrate URLs to the History API (pushState) so they look like normal paths (/product/123), and let Googlebot render the JavaScript directly.

The shift matters because the mental model changed. There is no longer a special "AJAX crawling" pipeline. There is just regular crawling, with an extra rendering step. Understanding that step is what this guide is about.

How Googlebot Actually Renders JavaScript Today

Googlebot uses an evergreen rendering engine built on Chromium. As of 2026 this tracks Chromium 121 and later, kept within a few weeks of the public stable release. That means modern JS features — ES2023 syntax, native ES modules, dynamic import, fetch, IntersectionObserver, Web Components — all work the way they do in the latest Chrome. You do not need to transpile to ES5 for Googlebot anymore.

The core architecture is two-pass indexing:

Pass 1 — HTML crawl. Googlebot fetches the raw HTML response from your server. It extracts links, basic metadata, and any content present in the initial HTML. The URL is queued for rendering and a tentative entry is added to the index based on what is in the raw HTML.

Pass 2 — Render. A headless Chromium instance fetches the HTML, executes JavaScript, waits for the page to settle, and produces a final rendered DOM. Google indexes the rendered HTML, replacing the placeholder from pass 1.

The render queue is the catch. It is not synchronous. Google has stated the median delay is around 5 seconds, but the long tail stretches to days for low-priority pages or high-load periods. Anything that depends on AJAX content for indexability — titles, descriptions, product data, internal links — will be invisible to Google until pass 2 completes.

What an AJAX-Heavy Page Looks Like to Googlebot

To make this concrete, here is a typical client-side-rendered page. The HTML response is essentially empty, and an AJAX call populates the content:

<!-- HTML response from server (what Pass 1 sees) -->
<!DOCTYPE html>
<html>
  <head>
    <title>Loading…</title>
  </head>
  <body>
    <div id="app">Loading…</div>
    <script src="/static/app.js"></script>
  </body>
</html>

// /static/app.js (what runs in the browser; Pass 2 indexes the resulting DOM)
fetch('/api/product/123')
  .then(r => r.json())
  .then(data => {
    document.title = data.name + ' | Acme Store';
    document.getElementById('app').innerHTML = `
      <h1>${data.name}</h1>
      <p>${data.description}</p>
      <span class="price">$${data.price}</span>
    `;
  });

On pass 1, Google sees a page titled "Loading…" with no real content. If pass 2 is delayed by 48 hours, this page sits in the index as a near-empty result for two days. Multiply that across 10,000 product pages and you have a serious indexing problem — even though the rendering will eventually succeed.

SSR vs CSR vs SSG vs ISR: The Decision Tree

Most JS-heavy sites end up with one of four rendering strategies. Here is when to use each:

Static Site Generation (SSG). Best for content that does not change per-request. Marketing pages, blog posts, documentation. The page is built once at deploy time, served as static HTML. Googlebot pass 1 sees the full content. Use Next.js generateStaticParams, Astro, Hugo, or Gatsby. This is the gold standard for SEO when it fits.

Incremental Static Regeneration (ISR). SSG with periodic rebuilds. A page is statically rendered, served from cache, and regenerated in the background after a configurable interval (e.g., every hour). Use for catalogs and listings that change too often for full SSG but not on every request. Next.js revalidate covers this.

Server-Side Rendering (SSR). Best for personalized or always-fresh content where the request itself drives the data — search results, dashboards, dynamic feeds. The server runs the JS framework on every request and returns full HTML. Googlebot pass 1 sees the rendered content. Slower than SSG, but no render-queue dependency.

Client-Side Rendering (CSR). Default React/Vue without an SSR framework. The HTML is empty, JS does everything. Acceptable for app-like surfaces behind a login (search engines do not need to index them) but a poor choice for any public, indexable page.

The decision tree: can this page be pre-built? If yes, SSG or ISR. Does it need per-request data? If yes and SEO matters, SSR. Is it gated behind auth? CSR is fine. There is almost never a good reason to use pure CSR for an indexable page.
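
To make the SSG/ISR branch concrete, here is a minimal Next.js App Router sketch. The route, API endpoint, and one-hour revalidation interval are illustrative, not prescriptive:

// app/posts/[slug]/page.tsx (SSG + ISR sketch; route, API, and interval are illustrative)
export const revalidate = 3600; // ISR: regenerate in the background at most once per hour

// Pre-build every known slug at deploy time
export async function generateStaticParams() {
  const posts: { slug: string }[] = await fetch('https://api.example.com/posts')
    .then(r => r.json());
  return posts.map((post) => ({ slug: post.slug }));
}

export default async function PostPage(
  { params }: { params: { slug: string } }
) {
  const post = await fetch(`https://api.example.com/posts/${params.slug}`)
    .then(r => r.json());
  return (
    <article>
      <h1>{post.title}</h1>
      <p>{post.body}</p>
    </article>
  );
}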

Next.js Patterns for SEO-Safe AJAX

Next.js is the most common framework for this problem, so here are the two canonical patterns.

App Router (Next.js 13+). Async server components fetch data during render, so the browser never needs an AJAX call for the initial content:

// app/products/[id]/page.tsx — App Router server component
import type { Metadata } from 'next';

export async function generateMetadata(
  { params }: { params: { id: string } }
): Promise<Metadata> {
  const product = await fetch(
    `https://api.example.com/products/${params.id}`,
    { next: { revalidate: 3600 } }
  ).then(r => r.json());
  return {
    title: `${product.name} | Acme Store`,
    description: product.description.slice(0, 155),
    alternates: { canonical: `https://example.com/products/${params.id}` },
  };
}

export default async function ProductPage(
  { params }: { params: { id: string } }
) {
  const product = await fetch(
    `https://api.example.com/products/${params.id}`,
    { next: { revalidate: 3600 } }
  ).then(r => r.json());

  return (
    <article>
      <h1>{product.name}</h1>
      <p>{product.description}</p>
      <span className="price">${product.price}</span>
    </article>
  );
}

Pages Router with getServerSideProps for legacy codebases:

// pages/products/[id].tsx — Pages Router
import type { GetServerSideProps, InferGetServerSidePropsType } from 'next';
import Head from 'next/head';

export const getServerSideProps: GetServerSideProps = async (ctx) => {
  const id = ctx.params?.id as string;
  const res = await fetch(`https://api.example.com/products/${id}`);
  if (!res.ok) return { notFound: true };
  const product = await res.json();
  return { props: { product } };
};

export default function ProductPage(
  { product }: InferGetServerSidePropsType<typeof getServerSideProps>
) {
  return (
    <>
      <Head>
        <title>{product.name} | Acme Store</title>
        <meta name="description" content={product.description.slice(0, 155)} />
        <link rel="canonical"
          href={`https://example.com/products/${product.id}`} />
      </Head>
      <article>
        <h1>{product.name}</h1>
        <p>{product.description}</p>
      </article>
    </>
  );
}

In both patterns, the data fetch happens on the server. The HTML response that Googlebot pass 1 sees already contains the title, description, and content. There is no dependency on JS execution for indexing.

Dynamic Rendering: The Middle Path

If you cannot migrate to SSR or SSG — your codebase is too large, or it is a true SPA with no server-rendering capability — dynamic rendering is the documented workaround. Detect Googlebot (and other bots) by user agent, and serve them a pre-rendered HTML snapshot via a service like Prerender.io, Rendertron, or a custom Puppeteer worker. Real users continue to get the SPA.

Google explicitly permits this pattern as long as the rendered content matches what users see — it is not cloaking. Here is a minimal nginx config for Prerender.io:

# /etc/nginx/sites-available/example.conf
server {
  listen 443 ssl http2;
  server_name example.com;

  location / {
    # Detect known crawlers
    if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|duckduckbot|slurp|facebookexternalhit|twitterbot|linkedinbot") {
      set $prerender 1;
    }
    # Skip prerender for static assets
    if ($uri ~* "\.(js|css|png|jpg|gif|svg|woff2?|ico|map)$") {
      set $prerender 0;
    }
    if ($prerender = 1) {
      rewrite ^(.*)$ /prerender last;
    }
    try_files $uri /index.html;
  }

  location /prerender {
    proxy_set_header X-Prerender-Token "YOUR_PRERENDER_TOKEN";
    proxy_set_header X-Prerender-User-Agent $http_user_agent;
    proxy_pass https://service.prerender.io/https://example.com$request_uri;
  }
}

Dynamic rendering should be a stepping stone, not a destination. Google has signalled it is a workaround, not a recommendation. Plan to migrate to SSR or SSG within 12–18 months of adopting it. Maintaining two rendering paths is technical debt that grows over time.

JSON-LD via JS Injection: When It Works and When It Does Not

Google has confirmed it picks up JSON-LD structured data injected by JavaScript during pass 2 rendering. That works in practice for most sites — but with the same caveat as all JS-rendered content: pass 1 will not have it, and rich-result eligibility may lag by days.

// Client-side JSON-LD injection (works, but not ideal for SEO)
function injectProductSchema(product) {
  const schema = {
    '@context': 'https://schema.org',
    '@type': 'Product',
    name: product.name,
    description: product.description,
    sku: product.sku,
    offers: {
      '@type': 'Offer',
      price: product.price,
      priceCurrency: 'USD',
      availability: product.inStock
        ? 'https://schema.org/InStock'
        : 'https://schema.org/OutOfStock',
    },
  };
  const script = document.createElement('script');
  script.type = 'application/ld+json';
  script.text = JSON.stringify(schema);
  document.head.appendChild(script);
}

// Server-side rendering is more reliable. In Next.js App Router:
export default async function ProductPage({ params }) {
  const product = await fetchProduct(params.id);
  const schema = { /* ...same shape as above... */ };
  return (
    <>
      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }}
      />
      <h1>{product.name}</h1>
    </>
  );
}

Render structured data server-side wherever possible. The latency cost of waiting for pass 2 is most visible exactly when it hurts most — losing rich-result placement during a launch window or sale.

Common Pitfalls in AJAX Crawling

Content behind onclick handlers. Googlebot does not click. If a "Read more" or tab UI hides content behind a click event without rendering it in the DOM first, that content is invisible. Render all important content in the initial DOM and use CSS for show/hide, not conditional rendering.
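
For example, a tab UI can keep every panel in the DOM and toggle visibility with CSS, so nothing depends on a click. A minimal React sketch, with illustrative prop names:

// Tabs sketch: every panel is rendered up front and stays in the DOM;
// only CSS visibility changes on click, so Googlebot sees all panel content.
type Panel = { title: string; body: string };

function Tabs({ panels, active }: { panels: Panel[]; active: number }) {
  return (
    <div>
      {panels.map((panel, i) => (
        <section key={panel.title} style={{ display: i === active ? 'block' : 'none' }}>
          <h2>{panel.title}</h2>
          <p>{panel.body}</p>
        </section>
      ))}
    </div>
  );
}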

Infinite scroll without paginated URLs. Pages loaded by scrolling past a sentinel element are not crawled — Googlebot does not scroll. Provide paginated URL alternatives (?page=2, ?page=3) that load the same content server-side. The IntersectionObserver can drive the user-facing infinite scroll while pagination links handle the crawler.
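
A sketch of that split, assuming a sentinel element, an items container, and an /api/items endpoint that returns HTML fragments (all illustrative):

// Infinite scroll for users, crawlable pagination for bots.
// The markup includes a real link the crawler can follow, e.g.
//   <a class="next-page" href="/products?page=2">Next page</a>
const container = document.querySelector('#items');
const sentinel = document.querySelector('#load-more-sentinel');
const nextLink = document.querySelector<HTMLAnchorElement>('a.next-page');
let nextPage = 2;

const observer = new IntersectionObserver(async (entries) => {
  if (!entries[0].isIntersecting || !container) return;
  // Append the next page's items for scrolling users
  const html = await fetch(`/api/items?page=${nextPage}`).then(r => r.text());
  container.insertAdjacentHTML('beforeend', html);
  nextPage += 1;
  // Keep the fallback pagination link pointing at the next unloaded page
  if (nextLink) nextLink.href = `/products?page=${nextPage}`;
});

if (sentinel) observer.observe(sentinel);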

Lazy-loaded images without proper attributes. Native loading="lazy" works fine. JS-driven lazy loaders that swap a placeholder for the real src only after scroll often do not — Googlebot may index the placeholder. Use the loading attribute or ensure your loader fires on render, not on scroll.

Robots.txt blocking JS or CSS files. Common mistake: a legacy Disallow: /static/ or Disallow: /assets/ rule blocks Googlebot from fetching the JS bundles needed for rendering. Pass 2 fails silently. Verify your rules with the robots.txt report in Search Console (the standalone robots.txt tester has been retired) and explicitly allow .js, .css, and any API endpoints used for rendering.
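
For illustration, rules like these keep rendering resources fetchable even when other paths are blocked (the paths are placeholders for your own):

# robots.txt (illustrative paths)
User-agent: *
Disallow: /admin/
# Keep the bundles and API responses needed for rendering fetchable
Allow: /static/*.js
Allow: /static/*.css
Allow: /api/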

JS errors that abort rendering. An uncaught exception during render can leave the DOM in a half-built state. Googlebot indexes whatever is there. Wrap data-fetching in proper error boundaries, use the GSC URL Inspection "View tested page" feature to inspect render errors, and watch for "page resources couldn't be loaded" warnings.
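
As a sketch, here is the earlier client-side fetch made defensive, so a failed API call degrades to fallback content instead of aborting the render (the element ID and fallback copy are illustrative):

// Defensive version of the client-side product fetch: errors no longer abort rendering
async function loadProduct(id: string) {
  const app = document.getElementById('app');
  if (!app) return;
  try {
    const res = await fetch(`/api/product/${id}`);
    if (!res.ok) throw new Error(`API returned ${res.status}`);
    const data = await res.json();
    document.title = `${data.name} | Acme Store`;
    app.innerHTML = `<h1>${data.name}</h1><p>${data.description}</p>`;
  } catch (err) {
    // Leave something coherent in the DOM rather than a half-built page
    console.error('Product load failed', err);
    app.innerHTML = '<p>This product is temporarily unavailable.</p>';
  }
}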

Soft 404s from empty AJAX responses. If your SPA renders "Product not found" in the body but returns HTTP 200, Google flags these as soft 404s. Return a real 404 status code from the server when an entity does not exist, even on a JS route.
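
In Next.js App Router, for instance, notFound() lets the server respond with a genuine 404 for a missing entity instead of a 200 plus error copy. A minimal sketch, reusing the illustrative API from earlier:

// app/products/[id]/page.tsx: respond with a real 404 when the product does not exist
import { notFound } from 'next/navigation';

export default async function ProductPage(
  { params }: { params: { id: string } }
) {
  const res = await fetch(`https://api.example.com/products/${params.id}`);
  // Serves the not-found page; for non-streamed responses Next.js also sets a 404 status
  if (!res.ok) notFound();
  const product = await res.json();
  return <h1>{product.name}</h1>;
}

The Pages Router equivalent is the { notFound: true } return already shown in the getServerSideProps example above.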

Mobile-First Rendering and What It Means for AJAX

Google has been mobile-first by default since 2019 and exclusively mobile-indexed since 2023. The smartphone Googlebot user agent is what crawls and renders your pages. For AJAX-heavy sites, this matters because mobile rendering has stricter resource constraints — slower simulated network, lower CPU budget, and a tighter timeout window. A page that renders in 4 seconds on desktop may time out on mobile rendering.

Practical implications: keep JS bundles small (under 200KB gzipped per route is a good target), avoid blocking third-party scripts, and lazy-load anything not needed for first paint. If your largest content is gated behind 3+ chained AJAX calls, expect rendering issues at scale.

Google's Mobile-Friendly Test (now folded into the Lighthouse and URL Inspection tools) is still a quick way to verify a page renders correctly under mobile constraints. Use it on a representative sample of your AJAX-loaded URLs after every major frontend deploy.

Verifying AJAX Indexing With GSC URL Inspection

The single most useful diagnostic tool for AJAX crawling is the GSC URL Inspection "Test Live URL" → "View Tested Page" → "Rendered HTML" flow. This shows you the exact DOM Googlebot generated after rendering, alongside any JavaScript console errors and resource-load failures.

What to check:

Does the rendered HTML contain your real content? Search the rendered output for unique strings from your AJAX-loaded content. If they are missing, rendering failed or did not complete.

Are there blocked resources? The "More info" tab lists resources that Googlebot could not fetch — JS files, API endpoints, fonts, images. Each blocked resource is a potential rendering failure.

Are there JavaScript console errors? The console output shows errors that fired during rendering. A single unhandled exception can stop the script that populates your content.

Compare live test to indexed version. If "Test Live URL" works but "View Crawled Page" shows empty content, your last successful render was bad and Google has not retried yet. Submit for indexing once the underlying issue is fixed.

Run this check on at least one URL from every templated section of your site after each significant frontend deploy. Most AJAX rendering regressions are caught here, weeks before they show up in performance data.

Practical Checklist for AJAX-Heavy Sites

If your site loads important content via AJAX, work through this list:

1. Audit the raw HTML response. View source on a representative URL. Are the title, meta description, primary heading, and main content present in the HTML? If not, you have a CSR problem to solve; the sketch after this checklist automates the check across a batch of URLs.

2. Pick the right rendering strategy per route. Marketing and content pages → SSG/ISR. Product and category pages → SSR or ISR. App-like authenticated screens → CSR is fine. Do not apply one strategy to the whole site uniformly.

3. Verify robots.txt does not block JS, CSS, or API endpoints. Check the robots.txt report in Search Console. Explicitly allow /api/ if your renders depend on it.

4. Inspect rendered HTML in GSC. Confirm the content Google sees matches the content users see. Repeat after every major deploy.

5. Add fallback pagination URLs for infinite scroll. Crawlers need addressable, deep-linkable pagination even if users never see it.

6. Return real HTTP status codes. 404 when an entity is missing. 410 when permanently gone. 500 when broken. Do not let your SPA mask errors with a 200 response.

7. Run a sitemap audit. Make sure every URL in your sitemap returns 200, has unique server-rendered metadata, and is not stuck in a "discovered, currently not indexed" state. SitemapFixer flags these systemic issues across thousands of URLs in a single pass.
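
To automate step 1, a small script can fetch the raw HTML the way pass 1 does (no JavaScript execution) and confirm that key strings are already present. A minimal sketch; the URLs and marker strings are placeholders for your own templates:

// audit-raw-html.ts: does the raw HTML (pass 1) already contain the content we care about?
// Run with Node 18+ via: npx tsx audit-raw-html.ts
const checks: { url: string; mustContain: string[] }[] = [
  {
    url: 'https://example.com/products/123',
    mustContain: ['Acme Widget', '<h1>', 'name="description"'],
  },
  // ...one representative URL per template
];

for (const { url, mustContain } of checks) {
  const res = await fetch(url, { headers: { 'user-agent': 'raw-html-audit' } });
  const html = await res.text();
  const missing = mustContain.filter((s) => !html.includes(s));
  console.log(missing.length === 0
    ? `OK   ${url}`
    : `FAIL ${url} (missing: ${missing.join(', ')})`);
}

Any URL that fails here is relying on pass 2; move it to one of the server-rendering strategies above before worrying about anything else on the list.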
