Invalid XML Syntax in Your Sitemap

Updated April 2026 · By SitemapFixer Team

XML is strict: a single unescaped ampersand, an unclosed tag, or a stray byte-order mark at the start of the file is enough to make Googlebot reject your entire sitemap. When this happens, none of the URLs inside are processed, so the sitemap contributes nothing to discovery until the syntax error is fixed.

What is this error?

An invalid XML syntax error occurs when your sitemap.xml fails XML well-formedness rules. Common cases include: unescaped special characters in URLs (&, <, >, ", '), unclosed tags (<loc> without </loc>), mismatched or badly nested tags, and anything placed before the XML declaration, such as whitespace or a UTF-8 byte-order mark (BOM). A missing xmlns namespace declaration on <urlset> is strictly a schema violation rather than a syntax error, but the sitemap can be rejected in the same way. Search Console reports "Sitemap could not be read."

Why does it happen?

The most common cause is URLs containing query strings with & characters that weren't escaped to &amp;. Other causes: CMS plugins that concatenate strings instead of using a real XML library, template engines that emit unclosed tags when a field is null, file editors that save with a BOM, and copy-paste edits that break tag matching. Mixed CRLF and LF line endings in files written on Windows are sometimes blamed too, although conforming XML parsers normalize line endings, so they are rarely the real cause on their own.
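To make the ampersand problem concrete, here is a minimal Python sketch contrasting naive string concatenation with entity escaping. The helper name `loc_element` and the example.com URL are illustrative assumptions, not part of any particular CMS or plugin.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are passed as extra entities.
_QUOTES = {'"': "&quot;", "'": "&apos;"}


def loc_element(url: str) -> str:
    """Return a <loc> element with the URL entity-escaped (hypothetical helper)."""
    return f"<loc>{escape(url, _QUOTES)}</loc>"


raw = "https://example.com/article?id=123&author=jane"

# String concatenation leaves a bare & in the output, which is not well-formed XML.
naive = f"<loc>{raw}</loc>"

# Escaping turns the & into &amp; so the element parses cleanly.
safe = loc_element(raw)
print(safe)
```

Escaping by hand like this is only a stopgap for generators that already build strings; a real XML library does the same work automatically.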

Why does it hurt SEO?

It's the worst possible sitemap error: total failure. Google's XML parser stops at the first syntax error and treats the file as unreadable, meaning zero URLs in the sitemap contribute to discovery or lastmod signals. A single malformed ampersand in URL #2,847 out of 50,000 can nullify the other 49,999. Every day the error persists, fresh content accumulates without any discovery boost.

How to detect it

Run your sitemap through xmllint --noout sitemap.xml on the command line: it prints the line number of any parse error and marks the offending position. Google Search Console's sitemap report also shows "Sitemap could not be read" with a generic error message. SitemapFixer runs a full XML parse and reports the first error location plus any schema violations (missing xmlns, wrong element nesting).
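If xmllint is not available, a similar check can be sketched with Python's standard library. This is only an approximation of what a crawler does: it uses `xml.etree.ElementTree` (backed by the expat parser), and the function name `first_xml_error` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def first_xml_error(xml_bytes: bytes):
    """Return (line, column, message) for the first parse error, or None if well-formed."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple reported by the parser.
        line, column = exc.position
        return line, column, str(exc)
    return None


# Sample sitemap with an unescaped ampersand on line 3.
broken = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/article?id=123&author=jane</loc></url>
</urlset>
"""

print(first_xml_error(broken))
```

Like the parsers described above, this stops at the first error, so fix it and re-run until the function returns `None`.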

How to fix it

1. Run `xmllint --noout sitemap.xml` to get the exact line number of the first error.
2. Escape all special characters in URL values: replace & with &amp;, " with &quot;, ' with &apos;, < with &lt;, and > with &gt;.
3. Confirm the file starts with <?xml version="1.0" encoding="UTF-8"?> with no BOM before it.
4. Verify the root <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> is present.
5. Switch your generator from string concatenation to a proper XML library (lxml, xml.etree, xmlbuilder2).
6. Re-validate, upload, and resubmit the sitemap in Search Console.
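The switch to a proper XML library might look like the following sketch, which uses `xml.etree.ElementTree` from the Python standard library (3.8 or later for the `xml_declaration` argument). The function name `build_sitemap` and the example URL are assumptions; a real generator would also add fields such as lastmod.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> bytes:
    """Serialize URLs into sitemap XML; the library escapes text content itself."""
    # Register the sitemap namespace as the default so output uses xmlns="...".
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Assign the raw URL; & becomes &amp; during serialization.
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
    # Returns bytes that begin with an XML declaration and carry no BOM.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


sitemap = build_sitemap(["https://example.com/article?id=123&author=jane"])
print(sitemap.decode("utf-8"))
```

Because the URL is assigned as text rather than pasted into a string, there is no escaping step to forget. ElementTree writes the declaration with single quotes, which is equally valid XML.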

Real-world example

A media site's sitemap broke after they added query-string-based article URLs like /article?id=123&author=jane. The ampersand wasn't escaped. With the 12,000-URL sitemap unreadable, Google was left with only the 22 URLs it could discover from the home page links, and the site fell back to 22 indexed pages overall. After escaping the ampersand to &amp;, coverage recovered to 11,800 within 2 weeks.

Frequently Asked Questions

Why does Search Console say 'Could not read sitemap'?
Most often it's an XML parse error: an unclosed tag, an invalid character like an unescaped ampersand (&), or a missing xmlns namespace declaration. Google's XML parser stops at the first error and reports the entire sitemap as unreadable.
What characters must be escaped in sitemap XML?
Five characters need escaping inside URL values: & becomes &amp;, ' becomes &apos;, " becomes &quot;, < becomes &lt;, and > becomes &gt;. Query strings with & are the #1 cause of sitemap XML errors.
Does my sitemap need an XML declaration?
Yes. Every sitemap must start with <?xml version="1.0" encoding="UTF-8"?> on line 1 with no whitespace or BOM characters before it. Missing declarations and stray BOM bytes are the two most common top-of-file errors.
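The top-of-file rule can be checked on the raw bytes before any parsing. The sketch below is a rough illustration, assuming a UTF-8 sitemap; the function name `check_file_start` and its messages are invented for this example.

```python
import codecs

XML_DECL_PREFIX = b"<?xml"


def check_file_start(data: bytes) -> list:
    """List top-of-file problems: a UTF-8 BOM or other bytes before the declaration."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("UTF-8 BOM before the XML declaration")
        # Skip the BOM so the declaration check below looks at what follows it.
        data = data[len(codecs.BOM_UTF8):]
    if not data.startswith(XML_DECL_PREFIX):
        problems.append("file does not start with the XML declaration")
    return problems


clean = b'<?xml version="1.0" encoding="UTF-8"?><urlset/>'
print(check_file_start(clean))
print(check_file_start(codecs.BOM_UTF8 + clean))
print(check_file_start(b"\n" + clean))
```

Reading the file in binary mode matters here: opening it as text with an encoding such as `utf-8-sig` would silently strip the BOM and hide the problem.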