By SitemapFixer Team
Updated April 2026

Hreflang Canonical: The Rules for International SEO


Hreflang and canonical are the two tags that decide whether your international SEO strategy works or quietly fails. Together they tell Google "this page exists in multiple languages, and each language version is the authoritative one for its audience." Get the relationship wrong and Google will pick one version, ignore the rest, and your localised pages disappear from regional search results. This guide covers the exact rules for combining hreflang and canonical, the three implementation methods, and the specific patterns that break international visibility. For broader context on how the two tags interact, see the hreflang and canonical relationship guide.

The Golden Rule: Canonical Must Be Self-Referencing

This single rule is responsible for more international SEO failures than any other configuration mistake: the canonical tag on every language version must point to itself, not to the default language.

Hreflang tells Google "these URLs are equivalent in meaning but different in language or region." Canonical tells Google "this URL is the authoritative version of this content." The two are compatible only when each language version asserts itself as canonical. The moment a French page sets its canonical to the English page, Google reads that as "the French page is a duplicate of the English page, do not index it independently." The hreflang annotations are then ignored because Google treats them as references between a primary URL and its duplicates.

Here is the correct combined pattern for a three-language setup:

<!-- On https://example.com/en/page -->
<link rel="canonical" href="https://example.com/en/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
<link rel="alternate" hreflang="de" href="https://example.com/de/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />

<!-- On https://example.com/fr/page -->
<link rel="canonical" href="https://example.com/fr/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
<link rel="alternate" hreflang="de" href="https://example.com/de/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />

<!-- On https://example.com/de/page -->
<link rel="canonical" href="https://example.com/de/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
<link rel="alternate" hreflang="de" href="https://example.com/de/page" />
<link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />

Notice three things: every page's canonical points to itself; the hreflang block is identical on all three pages and includes every version (including the page itself); and x-default points to the language Google should serve when nothing else matches. This is the only configuration that makes hreflang and canonical work as intended.
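If your templates are generated rather than hand-written, the pattern above can be produced from a single mapping so the hreflang block cannot drift between pages. The following is a minimal sketch, assuming a hypothetical `build_head_tags` helper and a `VERSIONS` dictionary that are illustrations, not part of any CMS or library:

```python
from html import escape

# Hypothetical mapping of hreflang value -> absolute URL for one page group.
VERSIONS = {
    "en": "https://example.com/en/page",
    "fr": "https://example.com/fr/page",
    "de": "https://example.com/de/page",
}


def build_head_tags(versions: dict, current: str, x_default: str) -> str:
    """Return the canonical + hreflang <link> tags for one language version.

    versions  -- hreflang value -> absolute URL
    current   -- hreflang key of the page being rendered
    x_default -- hreflang key whose URL is reused as the x-default target
    """
    # The canonical always points at the page being rendered, never at the default language.
    lines = [f'<link rel="canonical" href="{escape(versions[current])}" />']
    # The hreflang block lists every version, including the current page.
    for lang, url in versions.items():
        lines.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        )
    lines.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(versions[x_default])}" />'
    )
    return "\n".join(lines)


print(build_head_tags(VERSIONS, current="fr", x_default="en"))
```

Because only the first line depends on `current`, the hreflang block is identical on every page by construction.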

Why Cross-Language Canonicals Break Hreflang

The most common broken pattern looks like this: a developer sees that /fr/page and /de/page are translations of /en/page and reasons that the English version is the "source of truth." So they set the canonical on every translated page to point back to the English URL. From a content management perspective this feels logical. From an SEO perspective it is catastrophic.

Here is the broken pattern to avoid:

<!-- WRONG: on https://example.com/fr/page -->
<link rel="canonical" href="https://example.com/en/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />

<!-- WRONG: on https://example.com/de/page -->
<link rel="canonical" href="https://example.com/en/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="de" href="https://example.com/de/page" />

What Google does with this configuration: it reads the canonical on the French page, decides the French URL is a duplicate of the English URL, and drops the French URL from the index. The hreflang annotation pointing to the French URL becomes a pointer to a non-canonical, non-indexed page, which Google ignores. The user searching in French for your product gets the English page or, more likely, gets your competitor's French page. You translated content for nothing because the canonical undid the work.

The mental model fix: hreflang already encodes the relationship between language versions. You do not need the canonical to express "these pages are related" — hreflang does that. The canonical's job is only to assert which URL is authoritative within its own language, which is always itself.
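To catch this pattern before it ships, you can parse a page's head and compare the declared canonical with the page's own URL. The sketch below uses only the Python standard library; `HeadLinkParser` and `canonical_problem` are hypothetical names for illustration, and the check assumes the page URL you pass in is already in its final, normalised form:

```python
from html.parser import HTMLParser
from typing import Optional


class HeadLinkParser(HTMLParser):
    """Collects the canonical URL and hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}  # hreflang value -> href

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower()
        if rel == "canonical":
            self.canonical = a.get("href")
        elif rel == "alternate" and a.get("hreflang"):
            self.alternates[a["hreflang"].lower()] = a.get("href")


def canonical_problem(page_url: str, html: str) -> Optional[str]:
    """Return a description of the canonical problem, or None if it is self-referencing."""
    parser = HeadLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return "no canonical declared"
    if parser.canonical == page_url:
        return None
    if parser.canonical in parser.alternates.values():
        return f"canonical points at another language version: {parser.canonical}"
    return f"canonical is not self-referencing: {parser.canonical}"


# The broken French page from the example above.
BROKEN_FR = """
<link rel="canonical" href="https://example.com/en/page" />
<link rel="alternate" hreflang="en" href="https://example.com/en/page" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
"""

print(canonical_problem("https://example.com/fr/page", BROKEN_FR))
```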

The Three Ways to Declare Hreflang

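Google accepts hreflang annotations in three places: HTML head tags, HTTP response headers, and XML sitemaps. Google treats the three as equivalent, so the choice comes down to what is easiest to maintain. Pick one per site: declaring the same cluster in two places adds nothing and invites conflicting signals when the copies drift apart.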

1. HTML head (most common). The link elements go inside the <head> of every page. This is the right method for sites with fewer than ~20 language versions and full control over page templates. The canonical lives next to the hreflang block, making the relationship explicit and easy to audit by viewing source.

2. HTTP header (for non-HTML files). When the file is a PDF or any other non-HTML resource that has no <head> to hold a <link> tag, hreflang is declared via the Link response header. For PDFs that need geo-targeting, this is the only option that travels with the file itself; the alternative is to list the PDF URLs in the XML sitemap.

# nginx config: hreflang for the PDF served at /en/, /fr/ and /de/docs/manual.pdf
# Every language version must send the same full set of alternates, including itself.
location ~ ^/(en|fr|de)/docs/manual\.pdf$ {
  add_header Link '<https://example.com/en/docs/manual.pdf>; rel="alternate"; hreflang="en", <https://example.com/fr/docs/manual.pdf>; rel="alternate"; hreflang="fr", <https://example.com/de/docs/manual.pdf>; rel="alternate"; hreflang="de", <https://example.com/en/docs/manual.pdf>; rel="alternate"; hreflang="x-default"';
}

# Apache equivalent in .htaccess (requires mod_headers)
<Files "manual.pdf">
  Header add Link "<https://example.com/en/docs/manual.pdf>; rel=\"alternate\"; hreflang=\"en\""
  Header add Link "<https://example.com/fr/docs/manual.pdf>; rel=\"alternate\"; hreflang=\"fr\""
  Header add Link "<https://example.com/de/docs/manual.pdf>; rel=\"alternate\"; hreflang=\"de\""
  Header add Link "<https://example.com/en/docs/manual.pdf>; rel=\"alternate\"; hreflang=\"x-default\""
</Files>

3. XML sitemap (best for large sites). When you have 20+ language versions or hundreds of thousands of URLs, the HTML method becomes impractical because every page needs the full hreflang block. The sitemap method centralises annotations into a single file Google reads once.
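For the HTTP header method (option 2 above), the long Link value is easy to get wrong by hand. Here is a minimal sketch that assembles it from a mapping; `build_link_header` and `PDF_VERSIONS` are hypothetical names, and the output is only the header value, which you would still need to wire into your server or application:

```python
# Hypothetical mapping of hreflang value -> absolute URL for one PDF.
PDF_VERSIONS = {
    "en": "https://example.com/en/docs/manual.pdf",
    "fr": "https://example.com/fr/docs/manual.pdf",
    "de": "https://example.com/de/docs/manual.pdf",
}


def build_link_header(versions: dict, x_default: str) -> str:
    """Build the value of a Link response header carrying hreflang alternates."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{lang}"'
        for lang, url in versions.items()
    ]
    # x-default reuses the URL of one of the declared versions.
    parts.append(f'<{versions[x_default]}>; rel="alternate"; hreflang="x-default"')
    return ", ".join(parts)


print("Link: " + build_link_header(PDF_VERSIONS, x_default="en"))
```

The same value should be sent on every language version of the file, so that each response lists the full set including itself.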

Hreflang and Canonical in an XML Sitemap

The sitemap-based hreflang declaration uses the xhtml:link namespace. Each <url> entry in the sitemap lists every language version of that URL group, including itself. The sitemap carries no canonical tag of its own: the URL inside the <loc> tag should be the canonical URL of that version, and each language version still needs a self-referencing canonical in its HTML head.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/page</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page" />
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page" />
    <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />
  </url>
  <url>
    <loc>https://example.com/fr/page</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page" />
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page" />
    <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />
  </url>
  <url>
    <loc>https://example.com/de/page</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/page" />
    <xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/page" />
    <xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/page" />
    <xhtml:link rel="alternate" hreflang="x-default" href="https://example.com/en/page" />
  </url>
</urlset>

Two non-obvious requirements: every URL must appear as a separate <url> entry (not just one entry with all alternates), and the hreflang block inside each entry must be identical and must include the URL itself. The sitemap does not replace the in-page canonical — your French page still needs <link rel="canonical" href="https://example.com/fr/page" /> in its HTML head. For more on sitemap-based declaration, see the hreflang sitemap guide.
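Since both requirements are mechanical, sitemap generation is a good candidate for a script. The following sketch builds the same structure as the example above with Python's standard `xml.etree.ElementTree`; `build_sitemap` and `VERSIONS` are hypothetical names, and the input shape (a list of language-to-URL mappings, one per page group) is an assumption about how your URL inventory is stored:

```python
import xml.etree.ElementTree as ET

SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page group: hreflang value -> absolute URL.
VERSIONS = {
    "en": "https://example.com/en/page",
    "fr": "https://example.com/fr/page",
    "de": "https://example.com/de/page",
}


def build_sitemap(groups: list, x_default: str) -> bytes:
    """Emit one <url> entry per language version, each with the full alternate set."""
    ET.register_namespace("", SM_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    for versions in groups:
        # The alternate list is computed once per group, so it is identical in every entry.
        alternates = list(versions.items()) + [("x-default", versions[x_default])]
        for url in versions.values():
            entry = ET.SubElement(urlset, f"{{{SM_NS}}}url")
            ET.SubElement(entry, f"{{{SM_NS}}}loc").text = url
            for lang, href in alternates:
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


print(build_sitemap([VERSIONS], x_default="en").decode("utf-8"))
```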

The Return-Tag Requirement

Hreflang annotations must be reciprocal. If page A points to page B with hreflang="fr", page B must point back to page A with the appropriate hreflang. If the return tag is missing, Google ignores the entire annotation pair. This is the single most common reason hreflang silently fails on large sites — a translator adds a new language version but forgets to update the older language pages to reference the new one.

The return tag rule has a strict consequence: every language page must list every language version, including itself. For a site with 10 languages, every page needs 10 hreflang entries (plus x-default). Adding an 11th language means updating all 10 existing pages, not just shipping the new one. If you cannot guarantee that update will happen atomically across all pages, switch to sitemap-based hreflang, where the update is one file.

To audit return tags manually for a small set of URLs, fetch each page and check that the hreflang block is identical. To audit at scale, use a crawler that surfaces "hreflang reciprocity errors" — Screaming Frog, Sitebulb, and SitemapFixer all flag missing return tags as a discrete error class.
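For a small URL set, the reciprocity check itself is a few lines once the annotations have been extracted. This is a sketch under the assumption that you already hold each page's declared alternates in a dictionary; `missing_return_tags` is a hypothetical helper, not the logic any of the crawlers named above actually uses:

```python
EN = "https://example.com/en/page"
FR = "https://example.com/fr/page"
DE = "https://example.com/de/page"

# Page URL -> {hreflang: target URL} as declared on that page.
# The French page has not been updated to list German.
ANNOTATIONS = {
    EN: {"en": EN, "fr": FR, "de": DE},
    FR: {"en": EN, "fr": FR},
    DE: {"en": EN, "fr": FR, "de": DE},
}


def missing_return_tags(annotations: dict) -> list:
    """Return (source, target) pairs where source links to target but target does not link back."""
    missing = []
    for source, alternates in annotations.items():
        for target in set(alternates.values()):
            if target == source:
                continue  # self-reference needs no return tag
            declared_on_target = annotations.get(target, {})
            if source not in declared_on_target.values():
                missing.append((source, target))
    return sorted(missing)


for source, target in missing_return_tags(ANNOTATIONS):
    print(f"{source} -> {target} has no return tag")
```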

x-default and When to Use It

The hreflang="x-default" annotation tells Google which URL to serve when no other language or region in your cluster matches the user. It is not optional in practice — without it, users outside your targeted regions will get whichever version Google decides is most relevant, which is usually a poor match.

Two correct uses of x-default: pointing to a language selector page (a neutral landing page that lets the user choose) or pointing to your global default version, typically English. Both are valid; the choice depends on your business model. A SaaS product with users in 100+ countries usually does better with a global English x-default. A retailer with shops in 5 countries usually does better with a language selector.

What x-default is not: it is not a fallback for missing translations. If you have French and German versions but no Spanish, x-default does not magically serve French to Spanish users. It only kicks in when none of your declared hreflang values match the user's settings. To serve a sensible page to undeclared regions, you still need the actual page at the x-default URL to be useful.
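A toy model makes the fallback order easier to see. The sketch below is a deliberately simplified illustration and not Google's actual selection logic: it assumes a user locale is tried as an exact match first, then by language alone, and only then falls through to x-default. `pick_url` and `DECLARED` are hypothetical names:

```python
from typing import Optional

# Hypothetical cluster: French and German exist, Spanish does not.
DECLARED = {
    "en": "https://example.com/en/page",
    "fr": "https://example.com/fr/page",
    "de": "https://example.com/de/page",
    "x-default": "https://example.com/en/page",
}


def pick_url(user_locale: str, declared: dict) -> Optional[str]:
    """Simplified model: exact locale, then bare language, then x-default."""
    locale = user_locale.lower()
    language = locale.split("-")[0]
    for candidate in (locale, language, "x-default"):
        if candidate in declared:
            return declared[candidate]
    return None  # no match and no x-default declared


print(pick_url("fr-CA", DECLARED))  # matched by language
print(pick_url("es-ES", DECLARED))  # nothing matches, so x-default is used
```

In this model a Spanish user lands on whatever sits at the x-default URL, never on the French page, which is why that page has to be useful in its own right.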

Common Mistakes That Break Hreflang Canonical

Canonical pointing to default language. Already covered above, but worth repeating because it is the most damaging error. Every translated page must self-canonical. If you find your CMS is hard-coding the canonical to point to the default-language slug, that is a CMS configuration bug, not a feature.

Language code typos. Hreflang values use an ISO 639-1 language code, optionally followed by an ISO 3166-1 alpha-2 region code. The most common mistakes: en-uk for British English (correct is en-gb; "UK" is not the ISO country code, "GB" is); cn for Chinese (correct is zh for the language, optionally zh-cn for mainland China or zh-tw for Taiwan, or the script codes zh-Hans and zh-Hant for Simplified and Traditional); and jp for Japanese (correct is ja). Capitalisation is not an error, since Google accepts en-us and en-US alike, but pick one form and be consistent. One typo invalidates that hreflang entry without affecting the others, so the failure is silent.

Pointing hreflang at a non-canonical URL. If your hreflang says href="https://example.com/fr/page" but that URL canonicals to /fr/page/ with a trailing slash (or to /fr/page?utm_source=... with parameters), Google sees the hreflang as pointing at a duplicate, ignores it, and may flag a hreflang error in GSC. Always make sure the URL inside hreflang exactly matches the canonical of the destination page.

Pointing hreflang at a redirected URL. If /fr/page 301 redirects to /fr/page/, do not put /fr/page in your hreflang block. Put the final URL. Hreflang to a redirect chain wastes crawl budget and is treated as ambiguous.

Missing return tags. Already covered, but watch out for the asymmetric pattern where the English page lists all other languages but the French page lists only itself and English. The English and French pair is confirmed, but any other page that points at French, such as German, has its fr entry ignored because French does not link back.

Conflicting hreflang sources. If you ship hreflang both in HTML and in the sitemap, and they ever drift out of sync, Google's behaviour is undefined. Pick one method per site and lock it down with a code review rule.
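The language code typos listed above are cheap to lint for in a build step. The sketch below is not a full ISO validator: it only checks the overall shape of the value and a small table of the known mistakes named in this guide. `lint_hreflang_value` and `KNOWN_MISTAKES` are hypothetical names:

```python
import re
from typing import Optional

# Mistakes named in this guide, mapped to the intended code.
KNOWN_MISTAKES = {"en-uk": "en-gb", "cn": "zh", "jp": "ja"}

# language, optional 4-letter script, optional 2-letter region (shape only).
SHAPE = re.compile(r"^[a-z]{2}(-[a-z]{4})?(-[a-z]{2})?$")


def lint_hreflang_value(value: str) -> Optional[str]:
    """Return a warning for a suspicious hreflang value, or None if it looks fine."""
    v = value.strip().lower()
    if v == "x-default":
        return None
    if v in KNOWN_MISTAKES:
        return f'"{value}" is not valid; did you mean "{KNOWN_MISTAKES[v]}"?'
    if not SHAPE.match(v):
        return f'"{value}" does not look like a language or language-region code'
    return None


for code in ["en-GB", "en-uk", "jp", "zh-Hant", "english", "x-default"]:
    print(code, "->", lint_hreflang_value(code))
```

A value that passes this check can still be invalid (the shape test accepts any two letters), so treat it as a first filter rather than proof of correctness.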

Debugging With GSC and the Rich Results Test

Google Search Console used to have a dedicated International Targeting report that surfaced hreflang errors aggregated across the site. That report was retired in 2022, and hreflang errors no longer appear as a discrete category in GSC. The replacement is a combination of three things: the URL Inspection tool, the Rich Results Test, and crawler-based auditing.

URL Inspection (per URL). In GSC, paste any of your URLs into the inspection bar. Look at the "Page indexing" section: if Google has chosen a different canonical than the one you declared, it will say "Google-selected canonical" with a different URL than your "User-declared canonical." This is the strongest signal that your canonical+hreflang setup is broken. Test the same URL in three or four languages and look for any discrepancy.

Rich Results Test (rendered HTML). Use https://search.google.com/test/rich-results with any of your localised URLs. Click "View Tested Page" → "HTML". Search the rendered HTML for hreflang and confirm: (a) the block is present, (b) it includes every language version, (c) every entry uses absolute URLs (not relative), (d) the canonical right next to it is self-referencing. If any of these fail in the rendered HTML even though they appear correct in your source, you have a JavaScript injection issue — Google's rendering pipeline did not pick up your tags.

Crawler audit (whole site). Run a crawler against your site that surfaces hreflang reciprocity errors, missing return tags, language code typos, and canonical/hreflang conflicts. SitemapFixer flags all of these as a single hreflang health score across your entire URL inventory. Per-URL debugging only catches issues you already suspect; crawler auditing catches the systemic ones you do not know about yet. For a deeper dive into specific error types, see the hreflang errors guide.
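Checks (a) to (d) from the Rich Results Test step can also be run against HTML you have saved from the "View Tested Page" panel. The sketch below uses only the standard library; `check_rendered_html` is a hypothetical helper, and it assumes you paste the rendered HTML in as a string and supply the lowercase hreflang values you expect to see:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects (rel, hreflang, href) for every <link> element in the HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            a = dict(attrs)
            self.links.append((
                (a.get("rel") or "").lower(),
                (a.get("hreflang") or "").lower(),
                a.get("href") or "",
            ))


def check_rendered_html(html: str, page_url: str, expected_langs: list) -> list:
    """Return a list of problems found; an empty list means all four checks passed."""
    collector = LinkCollector()
    collector.feed(html)
    alternates = {lang: href for rel, lang, href in collector.links
                  if rel == "alternate" and lang}
    canonicals = [href for rel, _, href in collector.links if rel == "canonical"]

    problems = []
    if not alternates:
        problems.append("(a) no hreflang block in rendered HTML")
    else:
        missing = sorted(set(expected_langs) - set(alternates))
        if missing:
            problems.append("(b) missing versions: " + ", ".join(missing))
        for lang, href in alternates.items():
            parts = urlsplit(href)
            if not (parts.scheme and parts.netloc):
                problems.append(f"(c) relative URL for {lang}: {href}")
    if canonicals != [page_url]:
        problems.append("(d) canonical is not self-referencing")
    return problems
```

An empty result on the rendered HTML combined with failures on the raw source, or the reverse, points at the JavaScript injection issue described above.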

When Hreflang Cannot Fix Duplicate Content

Hreflang only helps when each version genuinely differs for its audience. If your French page contains the same English text as the English page because translation has not happened yet, hreflang will not stop Google from treating the two URLs as duplicates and consolidating them under a single canonical. If the page is auto-translated and Google judges the result low quality, hreflang will not rescue it either. Google's duplicate detection runs independently of hreflang.

Two scenarios where this commonly bites: launching a new locale before content is translated (the new pages serve the source language with a different URL prefix), and machine-translated content that Google's language detection flags as low-quality. In both cases, the right fix is to noindex the untranslated pages until real translation lands, not to ship broken hreflang. For more, see the international duplicate content guide.
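The first scenario, a locale launched with source-language text, can be flagged with a rough similarity check before deciding whether to emit a noindex tag. This is a heuristic sketch only: `looks_untranslated`, `robots_meta`, and the 0.9 threshold are assumptions chosen for illustration, and word-level similarity says nothing about translation quality:

```python
from difflib import SequenceMatcher


def looks_untranslated(source_text: str, localized_text: str, threshold: float = 0.9) -> bool:
    """True when the localized text is nearly word-for-word the source text."""
    ratio = SequenceMatcher(None, source_text.split(), localized_text.split()).ratio()
    return ratio >= threshold


def robots_meta(source_text: str, localized_text: str) -> str:
    """Return a noindex tag for placeholder pages, or an empty string otherwise."""
    if looks_untranslated(source_text, localized_text):
        return '<meta name="robots" content="noindex" />'
    return ""


english = "Free shipping on all orders over fifty euros"
french = "Livraison gratuite pour toute commande de plus de cinquante euros"

print(repr(robots_meta(english, english)))  # placeholder page: noindex
print(repr(robots_meta(english, french)))   # translated page: no tag
```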

A Pre-Launch Checklist

Before shipping a new language version, run through this list. Each item maps to a failure mode covered above:

1. The new page's canonical points to itself, not to the source language.

2. The new page's hreflang block lists every existing language version plus itself.

3. Every existing page has been updated to include the new language in its hreflang block (return tags).

4. The x-default value still points to the correct fallback after the new language is added.

5. All hreflang URLs are absolute, use HTTPS, and match the destination page's canonical exactly.

6. Language codes are valid ISO codes (no en-uk, cn, or jp).

7. The Rich Results Test confirms the rendered HTML includes the full hreflang block on the new page.

8. The XML sitemap (if used) has been regenerated to include the new locale and all return-tag entries.

9. URL Inspection on a sample of new URLs shows the user-declared canonical matches the Google-selected canonical.

10. The new page contains genuinely translated content, not source-language placeholder text.

If any item fails, do not ship. Hreflang errors compound across pages and take weeks to clear from Google's index once introduced.
