Fix Your Sitemap for Squarespace
Squarespace publishes a sitemap automatically at /sitemap.xml and gives you almost no direct control. That forces you to fix issues through page settings, URL slugs, and limited robots.txt overrides.
Squarespace is the most closed of the major site builders. You can't edit sitemap.xml, can't edit robots.txt through a setting (though there are workarounds), can't exclude individual blog tags. What you have is per-page SEO toggles and URL slug control. That's actually enough to fix most issues - you just have to know which levers to pull.
Cleaned up a Squarespace portfolio site with 120 real pages and 840 URLs in the sitemap. The padding came from blog tag pages (one per tag ever used), a Not Linked folder full of old landing pages someone had forgotten about, and a .squarespace.com preview URL indexed as a mirror. After the fixes below, it dropped to 145 URLs.
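If you want to see where that kind of padding comes from before clicking through Pages, a quick audit of the live sitemap helps. Here's a minimal sketch in Python (standard library only); the example.com domain is a placeholder for your own, and it assumes the default Squarespace setup where /sitemap.xml is a single urlset rather than a sitemap index.

```python
# Minimal sitemap audit: fetch /sitemap.xml and group URLs by first path
# segment to see where the padding comes from. The domain below is a
# placeholder; swap in your own.
from collections import Counter
from urllib.parse import urlparse
from urllib.request import urlopen
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder domain
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urlopen(SITEMAP_URL) as resp:
    root = ET.fromstring(resp.read())

locs = [loc.text.strip() for loc in root.iterfind(".//sm:loc", NS)]
print(f"total URLs: {len(locs)}")

# Group by the first path segment: blog, shop, events, tag archives, etc.
buckets = Counter()
for url in locs:
    path = urlparse(url).path
    first = path.strip("/").split("/")[0] or "(home)"
    if "/blog/tag/" in path or "/blog/category/" in path:
        first = "blog tag/category archives"
    buckets[first] += 1

for segment, count in buckets.most_common():
    print(f"{count:5d}  {segment}")
```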
Common Squarespace Sitemap Issues
- Not Linked pages appearing in sitemap because they're still public, just hidden from nav
- Blog tag and category archives competing with the main blog for keywords
- Product variant URLs on Commerce stores appearing in the sitemap even when internal links use the clean parent slugs
- System pages (/cart, /checkout, /account) occasionally slipping in
- .squarespace.com preview URL indexable alongside the custom domain
- No lastmod granularity - Squarespace uses the publish date, not the latest edit
- Hidden pages with "Hide from search engine results" ticked but still in the sitemap (the toggle applies noindex to the page but Squarespace still lists the URL)
- Legacy events (Events collection) staying in the sitemap after the event has passed
What most tutorials get wrong
Every Squarespace SEO guide says "use the Hide from search engines toggle". That toggle adds a noindex meta tag but does not remove the page from sitemap.xml. Which means Google still crawls the URL, sees the noindex, and reports it as "Excluded by noindex tag" in GSC. If you truly want a page gone, unpublish it or delete it. The noindex toggle is for pages you want crawlable (e.g., gated content at a known URL) but not in search results.
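You can confirm this behaviour from outside Squarespace: fetch the page and the sitemap and compare. A rough sketch, using a placeholder URL and a simple string match for the robots meta tag rather than a full HTML parse:

```python
# Quick check for the "hidden but still listed" case: does the page carry a
# noindex robots meta tag, and does its URL still appear in sitemap.xml?
from urllib.parse import urlparse
from urllib.request import urlopen

PAGE_URL = "https://example.com/old-landing-page"  # placeholder

html = urlopen(PAGE_URL).read().decode("utf-8", errors="replace").lower()
has_noindex = 'name="robots"' in html and "noindex" in html

parsed = urlparse(PAGE_URL)
sitemap = urlopen(f"{parsed.scheme}://{parsed.netloc}/sitemap.xml").read().decode("utf-8")
in_sitemap = PAGE_URL in sitemap

print(f"noindex meta tag present: {has_noindex}")
print(f"listed in sitemap.xml:    {in_sitemap}")  # often True even with noindex
```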
Robots.txt on Squarespace
Squarespace doesn't let you edit robots.txt directly; the file is served at the hosting level and its rules are fixed unless you go further. The only supported routes to customization are the Squarespace Developer Platform (for 7.0 sites) or a request to Squarespace support. For most users, the practical approach is: use per-page noindex for anything that shouldn't rank, and rely on Squarespace's default robots rules (which block /config, the commerce admin paths, and so on).
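If you want to see what those defaults actually are for your site, fetching the served robots.txt is enough. A small sketch with a placeholder domain:

```python
# Inspect the robots.txt Squarespace serves for your domain and print the
# directive lines so you can see which paths the default rules disallow.
from urllib.request import urlopen

robots = urlopen("https://example.com/robots.txt").read().decode("utf-8")  # placeholder domain
for line in robots.splitlines():
    if line.lower().startswith(("user-agent", "disallow", "allow", "sitemap")):
        print(line)
```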
Blog tags and categories
Every tag you apply to a blog post creates a tag archive page at /blog/tag/tag-name, and these all go into the sitemap. Small sites end up with hundreds of thin tag pages. There's no bulk-exclude option, so the fix is either: use fewer tags (stick to 5-10 broad ones), or open each tag page individually in Pages and toggle "Hide from search engine results". Tedious but it works.
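To see which tag archives you're dealing with before consolidating, you can pull them straight out of the sitemap. A sketch that assumes the blog lives at /blog/ and uses a placeholder domain:

```python
# List every /blog/tag/ archive URL in the sitemap so you can decide which
# 5-10 tags to keep and which archives to hide from search results.
from urllib.parse import urlparse
from urllib.request import urlopen
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(urlopen(SITEMAP_URL).read())
tags = sorted(
    urlparse(loc.text).path.split("/tag/")[1]
    for loc in root.iterfind(".//sm:loc", NS)
    if "/blog/tag/" in loc.text
)
print(f"{len(tags)} tag archive pages in the sitemap:")
for tag in tags:
    print(" -", tag)
```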
Step-by-Step Fix Guide
- Open Pages > Not Linked and delete or unpublish legacy pages you don't want indexed
- For pages you want crawlable but not indexable, toggle "Hide from search engine results" in Page Settings > SEO
- Consolidate blog tags down to 5-10; hide-from-search any remaining thin tag archives
- Connect your custom domain and set it to primary under Settings > Domains; the .squarespace.com URL should redirect
- On Commerce stores, verify product URLs resolve to the canonical parent, not variant slugs
- Rename blog post slugs to short, keyword-rich paths before publishing (Squarespace keeps old slugs via 301)
- For past events, delete or unpublish once they're no longer relevant
- Verify with curl https://yourdomain.com/sitemap.xml (or the check script after this list)
- Submit the sitemap to Google Search Console under the custom-domain property
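Beyond eyeballing the curl output, a short script can re-check the sitemap against the issues listed earlier. A sketch with a placeholder domain; adjust the path checks to match your own store and blog slugs:

```python
# Post-fix verification: re-fetch the sitemap and flag anything that should
# no longer be there - .squarespace.com URLs, system pages, or a large
# number of tag/category archives.
from urllib.parse import urlparse
from urllib.request import urlopen
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(urlopen(SITEMAP_URL).read())
locs = [loc.text for loc in root.iterfind(".//sm:loc", NS)]

problems = []
for url in locs:
    parsed = urlparse(url)
    if parsed.netloc.endswith(".squarespace.com"):
        problems.append(f"built-in domain still listed: {url}")
    if parsed.path.rstrip("/") in ("/cart", "/checkout", "/account"):
        problems.append(f"system page in sitemap: {url}")

tag_pages = [u for u in locs if "/blog/tag/" in u or "/blog/category/" in u]
print(f"total URLs: {len(locs)}, tag/category archives: {len(tag_pages)}")
for p in problems or ["no system-page or built-in-domain URLs found"]:
    print(" -", p)
```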