Gzip/Compression Errors in Your Sitemap
When your sitemap.xml.gz is corrupted, double-compressed, or served with mismatched Content-Encoding headers, Googlebot cannot decode it and Search Console reports "Couldn't fetch." Every URL inside becomes invisible to search engines, even if the rest of your site is healthy.
What is this error?
A gzip/compression error occurs when a crawler requests your sitemap.xml.gz and either (a) receives data that cannot be inflated with the DEFLATE algorithm, (b) gets a file whose gzip magic bytes (1F 8B) are missing or whose stream is truncated, or (c) receives a `Content-Encoding: gzip` header on top of an already-compressed .gz body, so the number of decode passes no longer matches how many times the bytes were actually compressed. Google Search Console typically reports "Sitemap could not be read" or "General HTTP error" for these cases.
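If you have the file locally, a quick sanity check is to look at the first two bytes and test the full stream. This is a minimal sketch using standard tools; the filename is a placeholder for your own sitemap:

```bash
# A valid gzip file starts with the magic bytes 1f 8b.
od -A n -t x1 -N 2 sitemap.xml.gz      # expect: 1f 8b

# -t tests the whole stream, including the length/CRC trailer,
# without writing any decompressed output.
gzip -t sitemap.xml.gz && echo "gzip stream OK"
```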
Why does it happen?
The most common cause is a misconfigured server that applies gzip transport compression to a pre-gzipped static file. Nginx's gzip_static, Apache's mod_deflate, and Cloudflare's Auto Minify all have edge cases that can corrupt .gz sitemaps. Other causes include FTP uploads in ASCII mode (which rewrite bytes inside the binary gzip stream as if they were line endings), CMS plugins that write incomplete .gz trailers, and build pipelines that truncate files over 10MB.
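To see the transport-compression case from the outside, you can let curl honor the Content-Encoding header and then inspect what is left. A rough sketch, with the URL as a placeholder:

```bash
# --compressed makes curl request and transparently decode
# Content-Encoding: gzip, mimicking a client that honors the header.
curl -s --compressed https://yoursite.com/sitemap.xml.gz -o body.bin

# After transport decoding, a correctly served .gz sitemap should still be
# a gzip stream (first bytes 1f 8b). Plain XML or a gzip error here points
# to one of the misconfigurations above.
od -A n -t x1 -N 2 body.bin
gzip -t < body.bin || echo "body is not a valid gzip stream after transport decoding"
```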
Why does it hurt SEO?
A gzip error is catastrophic: Google treats the entire sitemap as empty, so none of its URLs receive the discovery boost sitemaps provide. Fresh content can take weeks longer to get crawled, and existing pages lose the lastmod signal entirely. Sites with 10,000+ URLs often see indexed-page counts collapse within two to three weeks of a gzip break.
How to detect it
Run `curl -I https://yoursite.com/sitemap.xml.gz` and check the Content-Encoding header. Then run `curl -s https://yoursite.com/sitemap.xml.gz | gunzip > /dev/null`; if it errors with "unexpected end of file" or "not in gzip format", you have a problem. Sitemap Fixer automates this check and shows you the exact byte position where decoding fails.
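Put together, a copy-pasteable version of that check could look like this (same placeholder domain as above):

```bash
URL="https://yoursite.com/sitemap.xml.gz"

# 1. Headers: a pre-gzipped file should not normally carry an extra
#    Content-Encoding: gzip on top of its own compression.
curl -sI "$URL" | grep -iE '^content-(type|encoding):'

# 2. Body: fetch the raw bytes and try to decode them; a gunzip error
#    here is the kind of failure Googlebot hits.
curl -s "$URL" | gunzip > /dev/null && echo "sitemap decodes cleanly"
```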
How to fix it
1. Regenerate the .gz file locally with `gzip -9 sitemap.xml` and upload it in binary mode.
2. Disable server-level gzip for .gz files (Nginx: `gzip off` in a `location ~ \.gz$` block).
3. Verify the response: `Content-Type: application/x-gzip` with NO `Content-Encoding: gzip` header.
4. If your sitemap is under 10MB, drop gzip entirely and serve plain sitemap.xml instead.
5. In Search Console, resubmit the sitemap and watch for the "Success" status within 24 hours.
6. Add a weekly cron that fetches and decodes the .gz file to catch future regressions (see the sketch below).
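For step 6, a minimal cron-able guard might look like the sketch below; the URL and the `mail` alert are placeholders, so swap in whatever alerting you already use:

```bash
#!/usr/bin/env bash
# check-sitemap-gz.sh - fail loudly if the sitemap stops decoding.
set -euo pipefail

URL="https://yoursite.com/sitemap.xml.gz"   # placeholder

if ! curl -sf "$URL" | gunzip > /dev/null 2>&1; then
    # Replace with your real alerting (Slack webhook, email, pager, ...).
    echo "gzip check FAILED for $URL" | mail -s "Sitemap gzip broken" you@example.com
    exit 1
fi
```

A crontab entry such as `0 6 * * 1 /usr/local/bin/check-sitemap-gz.sh` runs it every Monday morning.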
Real-world example
A WordPress publisher moved to Cloudflare and lost 40% of their indexed pages in three weeks. The culprit: Cloudflare's Auto Minify re-compressed their already-gzipped sitemap, producing a file with two gzip headers that Googlebot rejected. Disabling Auto Minify for .gz paths restored indexing within two weeks.
Common mistakes
- Letting the server apply Content-Encoding: gzip to an already-gzipped file
- Uploading .gz files via FTP in ASCII mode instead of binary
- Using `zip` instead of `gzip` - they produce different, incompatible formats (see the check below)
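If you are not sure which tool produced an existing archive, the standard `file` utility will tell you (the filename is an example):

```bash
# gzip output reports "gzip compressed data"; zip output reports
# "Zip archive data". Only the former is valid for sitemap.xml.gz.
file sitemap.xml.gz
```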