How to Find All Pages on a Website
Whether you are auditing your own site or researching a competitor, knowing how to find all pages on a website is a fundamental SEO skill. Here are six reliable methods, starting with the fastest.
1. Check the XML Sitemap
The fastest way to find all pages on a website is to check its XML sitemap. Most sites publish one at /sitemap.xml. Open your browser and go to example.com/sitemap.xml, and you will see every URL the site wants search engines to index. If the site uses a sitemap index file, it will link to child sitemaps organized by section (blog, products, pages). This is the most reliable record of what the site owner considers important, because owners list those pages explicitly. It is not guaranteed to be complete: pages that were never added to the sitemap will not appear in it.
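Reading the sitemap can also be scripted. The following is a minimal sketch, assuming the file follows the sitemaps.org schema; the function name parse_sitemap and the sample XML are illustrative and not taken from any particular tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return ("index", child_sitemap_urls) or ("urlset", page_urls)."""
    root = ET.fromstring(xml_text)
    # The root tag looks like "{namespace}sitemapindex" or "{namespace}urlset".
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    entry = "sm:sitemap" if kind == "index" else "sm:url"
    locs = [
        loc.text.strip()
        for loc in root.findall(f"{entry}/sm:loc", NS)
        if loc.text
    ]
    return kind, locs


# Hypothetical sample content standing in for a downloaded sitemap.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>"""

print(parse_sitemap(sample))
```

When the function reports "index", each returned URL is itself a sitemap that needs the same treatment.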
2. Use a Sitemap Finder Tool
Not every site has its sitemap at /sitemap.xml. Some use /sitemap_index.xml, /wp-sitemap.xml, or reference it only in robots.txt. A sitemap finder tool checks all 20+ common locations automatically. Enter any domain in SitemapFixer and we will discover the sitemap, parse every URL, and show you the complete list grouped by section — no manual searching required.
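To illustrate the idea of probing several locations, here is a small sketch that builds the list of URLs to try. It is not SitemapFixer's implementation: the path list contains only the three locations named above, and candidate_sitemap_urls is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Only the three paths named in the text; a real finder would check more.
COMMON_SITEMAP_PATHS = ["/sitemap.xml", "/sitemap_index.xml", "/wp-sitemap.xml"]


def candidate_sitemap_urls(domain):
    """Build the URLs to probe, given a bare domain or a full URL."""
    if "://" not in domain:
        domain = "https://" + domain
    parts = urlsplit(domain)
    origin = f"{parts.scheme}://{parts.netloc}"
    # robots.txt goes last: it is not a sitemap, but it may point to one.
    return [origin + path for path in COMMON_SITEMAP_PATHS] + [origin + "/robots.txt"]


for url in candidate_sitemap_urls("example.com"):
    print(url)
```

A caller would request each URL in order and stop at the first one that returns sitemap XML.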
3. Google site: Operator
Type site:example.com into Google Search to see a sample of the pages Google has indexed. Google does not guarantee that site: returns every indexed URL, so treat the results as indicative rather than exhaustive. This shows you what Google has found, which may differ from the sitemap. Pages that appear in the site: results but not in the sitemap are being discovered through links. Pages in the sitemap but not in the site: results may have indexing issues. Comparing these two lists is a powerful SEO diagnostic technique.
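The comparison itself is a pair of set differences. A minimal sketch, assuming both lists have already been collected by hand or from an export; the URLs below are made-up sample data and compare_url_lists is a hypothetical name.

```python
def compare_url_lists(sitemap_urls, indexed_urls):
    """Return the URLs that appear in one list but not the other."""

    def norm(url):
        # Treat "/pricing" and "/pricing/" as the same page.
        return url.strip().rstrip("/")

    sitemap = {norm(u) for u in sitemap_urls}
    indexed = {norm(u) for u in indexed_urls}
    return {
        "in_sitemap_not_indexed": sorted(sitemap - indexed),
        "indexed_not_in_sitemap": sorted(indexed - sitemap),
    }


sitemap_urls = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/blog/new-post",
]
indexed_urls = [
    "https://example.com",
    "https://example.com/pricing/",
    "https://example.com/old-landing-page",
]

for label, urls in compare_url_lists(sitemap_urls, indexed_urls).items():
    print(label, urls)
```

The trailing-slash normalization is a simplification; a site that serves different content at the two forms would need stricter matching.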
4. Google Search Console
If you own the site, Google Search Console provides the most accurate view. The Pages report shows every URL Google knows about, its indexing status, and any issues. The Sitemaps section shows which URLs were submitted versus indexed. This is the authoritative source — but it only works for sites you have verified ownership of.
5. Crawl the Website
Tools like Screaming Frog, Ahrefs Site Audit, or Sitebulb crawl a website by following every link from the homepage. This discovers pages that may not be in the sitemap. A link-following crawl cannot reach orphan pages (pages with no internal links pointing to them) on its own; crawlers find them by comparing the crawl against the sitemap or another URL source. Crawling is thorough but slow for large sites, and some of these tools require desktop software. For a quick alternative, SitemapFixer parses your sitemap and identifies structural issues in 60 seconds.
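To show what "following every link" means in practice, here is a simplified breadth-first crawl. It is a sketch and not how any of the named tools work internally: the fetch argument is a caller-supplied function (URL in, HTML string or None out), and the in-memory fake_site stands in for real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_pages=500):
    """Breadth-first crawl that stays on the start URL's host.

    Returns every same-host URL discovered, up to max_pages.
    """
    host = urlsplit(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            link = urldefrag(urljoin(url, href)).url  # resolve and drop #fragment
            same_host = urlsplit(link).netloc == host
            if same_host and link not in seen and len(seen) < max_pages:
                seen.add(link)
                queue.append(link)
    return sorted(seen)


# Hypothetical three-page site used instead of live requests.
fake_site = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog/">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a> <a href="https://other.example.org/">Ext</a>',
    "https://example.com/blog/": '<a href="post-1#comments">Post</a>',
    "https://example.com/blog/post-1": "",
}

print(crawl("https://example.com/", fake_site.get))
```

A production crawler would also respect robots.txt, throttle requests, and handle redirects, none of which this sketch attempts.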
6. Check robots.txt
The robots.txt file at example.com/robots.txt often references the sitemap URL directly. It also shows which sections of the site are blocked from crawling. A Disallow rule stops Google from crawling those pages, but it does not guarantee they stay out of search results: a blocked URL can still be indexed without a description if other pages link to it. This holds whether or not the pages are listed in the sitemap.
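Both pieces of information can be read with Python's standard library. A minimal sketch, assuming the robots.txt content has already been downloaded; the sample file and the read_robots helper are illustrative.

```python
from urllib.robotparser import RobotFileParser


def read_robots(robots_text):
    """Parse robots.txt content and return (parser, sitemap_urls)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when the file has no Sitemap lines (Python 3.8+).
    return parser, (parser.site_maps() or [])


# Hypothetical robots.txt content.
sample = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap_index.xml
"""

parser, sitemaps = read_robots(sample)
print(sitemaps)
print(parser.can_fetch("*", "https://example.com/admin/login"))
print(parser.can_fetch("*", "https://example.com/blog/"))
```

Note that can_fetch reports crawl permission only; it says nothing about whether a URL is indexed.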
How to List All Pages on a Website (Exporting the URL List)
Finding the URLs is step one. The second step — listing them in a usable format — is what makes the audit actionable. A "list all pages on a website" workflow produces a single flat file of URLs you can hand to a copywriter, import into a spreadsheet, or feed into another tool. Three reliable export paths:
From the sitemap directly. Open the sitemap in a browser and view source. The URLs sit inside <loc> tags, so a regex grab (/<loc>(.+?)<\/loc>/g) extracts the list. For a sitemap index that points to multiple child sitemaps, run the same grab against each child and concatenate. Command-line option: curl -s https://example.com/sitemap.xml | grep -oP '(?<=<loc>).+?(?=</loc>)' produces a clean newline-separated list. The -P flag requires GNU grep, so the command will not work with the BSD grep that ships with macOS.
From SitemapFixer. Enter the domain on the homepage and we crawl, parse, and group every URL by section automatically. The export gives you a CSV with URL, last-modified date, and cluster — no command line required.
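From Search Console. Property → Indexing → Pages, then open a status group and use Export. This gives you Google's view of the site rather than the site's self-reported view, which is the right list for SEO triage. Search Console limits each export to a sample of URLs (1,000 rows), so large sites will not get a complete list this way. The two lists will not match perfectly, and that gap is often where the most interesting findings hide.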
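For the sitemap export path, the "grab each child and concatenate" step can be sketched in a few lines. This assumes the same <loc> regex approach described above rather than a full XML parser; fetch is a caller-supplied function (URL in, XML text out), and the files dictionary is made-up sample data.

```python
import re
from html import unescape

# Same idea as the <loc> regex in the text, tolerant of whitespace and newlines.
LOC_RE = re.compile(r"<loc>\s*(.+?)\s*</loc>", re.S)


def list_all_urls(sitemap_url, fetch, _seen=None):
    """Return page URLs from a sitemap, following sitemap index files."""
    seen = _seen if _seen is not None else set()
    if sitemap_url in seen:  # guard against an index that references itself
        return []
    seen.add(sitemap_url)
    xml_text = fetch(sitemap_url)
    # unescape turns XML entities such as &amp; back into plain characters.
    locs = [unescape(loc) for loc in LOC_RE.findall(xml_text)]
    if "<sitemapindex" in xml_text:
        urls = []
        for child in locs:
            urls.extend(list_all_urls(child, fetch, seen))
        return urls
    return locs


# Hypothetical sitemap index with two child sitemaps.
files = {
    "https://example.com/sitemap.xml": (
        "<sitemapindex>"
        "<sitemap><loc>https://example.com/s1.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/s2.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/s1.xml": "<urlset><url><loc>https://example.com/a</loc></url></urlset>",
    "https://example.com/s2.xml": "<urlset><url><loc>https://example.com/b?x=1&amp;y=2</loc></url></urlset>",
}

print("\n".join(list_all_urls("https://example.com/sitemap.xml", files.__getitem__)))
```

The printed output is the flat, newline-separated list the workflow calls for and can be redirected straight into a text file.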
How to Find All Subpages of a Website (Including Hidden Ones)
"Subpages" usually means pages beneath the homepage on the same domain — anything that is not the root /. The sitemap gives you the official list. To find subpages the sitemap omits, combine three techniques:
Path-scoped site: queries. Run site:example.com/blog/ to surface only blog subpages, then site:example.com/docs/ for docs, and so on for every top-level directory you know exists. Google shows at most a few hundred results for any single query (around 300 in its UI), so a domain-wide site: query hits that ceiling quickly. Splitting the search by path gives each section its own allowance, and together the queries surface far more URLs.
Wayback Machine path discovery. Visit web.archive.org/web/*/example.com/* and the URL listing shows every URL the archive has snapshotted under that domain, along with its capture dates. Useful for finding decommissioned pages that still attract backlinks (a content opportunity if you re-publish or 301 them).
Subdomain enumeration. A "site" in practice often includes subdomains the main sitemap ignores. Run site:*.example.com for a wildcard subdomain search, or use crt.sh to query certificate transparency logs (https://crt.sh/?q=%25.example.com). Publicly trusted TLS certificates issued for the domain are recorded in these logs, which often reveals dev, staging, and forgotten subdomains.
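The certificate transparency lookup can be post-processed into a clean subdomain list. This sketch rests on two assumptions about crt.sh that are not stated in this article: that appending &output=json to the query returns a JSON array, and that each record has a name_value field holding one or more hostnames separated by newlines. The sample body below is made up.

```python
import json


def subdomains_from_crtsh(body, domain):
    """Extract unique hostnames under `domain` from a crt.sh JSON response body."""
    names = set()
    for entry in json.loads(body):
        # Assumed field: name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard prefix
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Hypothetical response body.
sample_body = json.dumps([
    {"name_value": "example.com\nwww.example.com"},
    {"name_value": "*.staging.example.com"},
    {"name_value": "dev.example.com"},
])

print(subdomains_from_crtsh(sample_body, "example.com"))
```

A hostname appearing in a certificate does not prove the subdomain is still live, so each result is worth checking before it goes into an audit.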
If your goal is a complete subpage list for a site you own, the cleanest path is to combine your sitemap with a server log scan: every URL that returned HTTP 200 in the last 90 days is a subpage that exists, whether it is in the sitemap or not.
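A server log scan of this kind could look like the sketch below. It assumes the common or combined access log layout that Apache and nginx use by default; the sample lines are invented, and live_paths is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

# Timestamp in brackets, then the quoted request line, then the status code.
LOG_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def live_paths(log_lines, now, days=90):
    """Return paths that answered HTTP 200 within the last `days` days."""
    cutoff = now - timedelta(days=days)
    paths = set()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match or match.group("status") != "200":
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if when >= cutoff:
            paths.add(match.group("path").split("?", 1)[0])  # ignore query strings
    return sorted(paths)


# Hypothetical log lines.
sample_log = [
    '203.0.113.5 - - [10/Mar/2025:13:55:36 +0000] "GET /blog/post-1?utm=x HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '203.0.113.6 - - [11/Mar/2025:09:12:01 +0000] "GET /missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Nov/2024:08:00:00 +0000] "GET /old-page HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

now = datetime(2025, 4, 1, tzinfo=timezone.utc)
print(live_paths(sample_log, now))
```

Passing now explicitly keeps the function testable; in a real run it would be datetime.now(timezone.utc). Pages that return 200 for missing content (soft 404s) would still need manual review.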
How to See All Pages of a Website Without Installing Software
Desktop crawlers like Screaming Frog are powerful but require installation, a licence above 500 URLs, and a few hours to learn. For most audits — and for the "find all pages on a website online" use case — the in-browser path is faster and free:
Step 1. Open the sitemap URL directly (example.com/sitemap.xml). If it 404s, try example.com/sitemap_index.xml, example.com/wp-sitemap.xml, and example.com/robots.txt — the robots file often references the real sitemap location.
Step 2. If you cannot locate a sitemap manually, paste the domain into SitemapFixer. It checks 20+ common sitemap paths automatically and reads robots.txt for the declared sitemap location. You get the URL list in your browser, no install needed.
Step 3. Cross-reference with the Google site: operator for indexed coverage. The difference between the sitemap list and the site: list is your most useful signal. Sitemap URLs missing from site: may not be indexed; site: URLs missing from the sitemap are content the sitemap does not declare.