
XML sitemaps

Your sitemap is the most direct signal you send search engines about which URLs matter. It is a hint rather than a command, so when it's full of redirects, 404s, and noindex pages, that signal becomes noise. Here's how to keep your sitemap clean and what the audit checks.

Why sitemap hygiene matters

A sitemap is the one place you explicitly tell search engines "these are my pages." If that list includes URLs that redirect, error, or carry noindex, you're handing crawlers a contradictory map.

The principle is simple: your sitemap should list exactly the canonical, indexable, 200-status URLs you want in search — no more, no less. The explainer below covers the full set of best practices, from what to include to sitemap-index structure, lastmod accuracy, and how to keep the sitemap in sync with your canonical URL set.
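As a rough illustration of that rule, the sketch below filters a set of crawled pages down to the URLs that belong in the sitemap. The `Page` record and its field names are hypothetical, chosen for this example only; they are not the audit tool's data model.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions for this sketch."""
    url: str
    status: int       # HTTP status code returned for the URL
    canonical: str    # canonical URL the page declares
    noindex: bool     # True if the page carries a noindex directive


def belongs_in_sitemap(page: Page) -> bool:
    """Keep only 200-status, indexable pages whose canonical is themselves."""
    return page.status == 200 and not page.noindex and page.canonical == page.url


def sitemap_urls(pages: list[Page]) -> list[str]:
    """Return the URLs that pass all three conditions, in input order."""
    return [p.url for p in pages if belongs_in_sitemap(p)]
```

A redirecting URL, a noindex page, or a page that canonicalizes elsewhere each fails one condition and is left out.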

Read the Sitemaps explainer

Everything that belongs in a clean, useful sitemap — and what to keep out.

XML sitemap best practices →

What to include and exclude, the 50,000-URL limit and sitemap index, declaring it in robots.txt, accurate lastmod, and keeping the sitemap aligned with your canonical, indexable URL set.
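To make the limit and the index structure concrete, here is a minimal sketch that splits a URL list into files of at most 50,000 entries, builds a sitemap index pointing at them, and produces the robots.txt line that declares it. The function names and the example.com file names are assumptions for illustration, and the sketch does not check file size.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit in the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a URL list into groups no larger than the per-file limit."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locations):
    """Return sitemap index XML that points at each child sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locations:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


def robots_txt_line(index_url):
    """The robots.txt directive that declares the sitemap location."""
    return f"Sitemap: {index_url}"
```

With 120,000 URLs, `chunk_urls` yields three groups, so the index would list three child sitemaps and robots.txt would need only the single line pointing at the index.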

What the audit checks in this category

Grounded in the real checks the crawler runs.

Non-canonical URLs in the sitemap · 3XX (redirect) URLs in the sitemap · sitemap too large (over the URL/size limit) · sitemap missing lastmod · future-dated lastmod · sitemap syntax errors · sitemap timeout or unreachable · indexable pages missing from the sitemap · pages that appear in multiple sitemaps · pages dropped from the sitemap since the last crawl · robots.txt failing to declare the sitemap.
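A few of these checks can be approximated in a short script. The sketch below is an illustration under stated assumptions, not the crawler's actual implementation: it parses one sitemap document and reports syntax errors, an over-limit URL count, missing lastmod values, and future-dated lastmod values. The function name and result keys are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000  # per-file URL limit in the sitemap protocol


def parse_day(lastmod):
    """Read the YYYY-MM-DD prefix of a lastmod value; None if it has no full date."""
    try:
        return date.fromisoformat(lastmod[:10])
    except ValueError:
        return None


def check_sitemap(xml_text, today):
    """Run a few illustrative hygiene checks on one sitemap document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return {"syntax_error": True}

    missing, future = [], []
    entries = root.findall("sm:url", NS)
    for entry in entries:
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            missing.append(loc)
            continue
        day = parse_day(lastmod)
        if day is not None and day > today:
            future.append(loc)

    return {
        "syntax_error": False,
        "too_large": len(entries) > MAX_URLS,
        "missing_lastmod": missing,
        "future_lastmod": future,
    }
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the future-date check repeatable when you test it.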

Clean up your sitemap

Free to start. Find redirects, errors, and non-canonical URLs hiding in your sitemap.

Start my free audit