What is crawlability in SEO?

Crawlability is whether search engine bots can reach and read your pages. It depends on your robots.txt rules, your internal link structure (a page with no internal links pointing to it is hard for bots to discover), a server that responds reliably, and the absence of crawl traps such as redirect loops and long redirect chains. If a page isn't crawlable, its content can't be indexed, so crawlability is the first thing to get right.
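To make the crawl-trap idea concrete, here is a minimal sketch that follows a URL through a redirect map and reports whether it ends cleanly, loops, or runs too long. The function name trace_redirects, the max_hops limit of 10, and the sample paths are assumptions for illustration, not how any particular crawler works.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow `start` through a {source: target} redirect map.

    Returns (path, status), where status is "ok", "loop" or "too_long".
    max_hops is an arbitrary illustrative limit, not a search engine's.
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            # We have been here before: the chain never resolves.
            return path, "loop"
        seen.add(current)
        if len(path) - 1 > max_hops:
            return path, "too_long"
    return path, "ok"


# Hypothetical redirect map: /a resolves after two hops, /x and /y loop.
redirects = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}

chain = trace_redirects("/a", redirects)
loop = trace_redirects("/x", redirects)
print(chain)
print(loop)
```

A chain that resolves still costs one request per hop, so even an "ok" result with several hops is worth shortening to a single redirect.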

What is crawl budget and does it matter for my site?

Crawl budget is the number of URLs a search engine is willing to crawl on your site in a given period. For small sites (a few thousand URLs) it rarely matters, because Google can crawl everything easily. It matters for large sites, sites with lots of low-value URLs (faceted navigation, parameters), or sites that waste crawls on redirects and 404s. On those, wasted crawl budget means important pages get crawled and refreshed less often.
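One rough way to see where parameter URLs multiply is to count how many distinct URLs share the same path. The sketch below assumes you already have a list of crawled URLs (for example from server logs); the function name variants_per_path and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def variants_per_path(urls):
    """Count how many distinct URLs share each path, ignoring the query string."""
    counts = Counter()
    for url in set(urls):  # de-duplicate exact repeats first
        counts[urlsplit(url).path] += 1
    return counts


# Hypothetical sample: one category page reachable through several
# filter and sort combinations, plus one ordinary page.
crawled = [
    "https://example.com/shoes",
    "https://example.com/shoes?color=red",
    "https://example.com/shoes?color=red&sort=price",
    "https://example.com/shoes?sort=price&color=red",
    "https://example.com/about",
]

counts = variants_per_path(crawled)
wasteful = {path: n for path, n in counts.items() if n > 1}
print(wasteful)
```

Paths with many variants are candidates for review; whether they actually waste crawl budget depends on the size of the site and on whether the variants show meaningfully different content.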

Can robots.txt remove a page from Google?

No, and this is a common and costly mistake. Blocking a URL in robots.txt stops Google from crawling it, but it does not remove the URL from the index; a blocked URL can still appear in results if other pages link to it. Blocking also prevents Google from seeing a noindex tag on the page, because it can't crawl the page to read the tag. To remove a page from search, allow it to be crawled and add a noindex tag, or use the removal tools. Use robots.txt to manage crawling, not indexing.
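The conflict described above can be checked mechanically. This is a minimal sketch using Python's standard urllib.robotparser; the function name blocked_noindex_conflicts, the sample robots.txt and the page list are assumptions for illustration. Python's parser uses simple prefix matching and may not agree with Google's parser on wildcards or on conflicting Allow and Disallow rules, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the page someone wants removed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-page
"""


def blocked_noindex_conflicts(robots_txt, pages, user_agent="*"):
    """Return URLs that carry noindex but are blocked from crawling.

    `pages` maps URL -> True if the page has a noindex tag. A crawler
    that obeys robots.txt never fetches these pages, so it never reads
    the tag.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url, has_noindex in pages.items()
        if has_noindex and not parser.can_fetch(user_agent, url)
    ]


pages = {
    "https://example.com/old-page": True,   # noindex, but blocked
    "https://example.com/thank-you": True,  # noindex and crawlable
    "https://example.com/pricing": False,   # no noindex tag
}

conflicts = blocked_noindex_conflicts(ROBOTS_TXT, pages)
print(conflicts)
```

Each URL in the result needs its Disallow rule lifted before the noindex tag can take effect.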


Crawlability

Before a page can rank, a search engine has to reach it. Crawlability covers robots.txt, crawl budget, and the structural issues that stop bots from discovering your content. Get this wrong and the rest of your SEO never gets a chance.


Why crawlability is the foundation

Crawling, then indexing, then ranking. They happen in that order, and each depends on the one before it. A page that can't be crawled can't be indexed, and a page that isn't indexed can't rank.

Most crawlability problems come from one of three places: a robots.txt rule that blocks more than intended, a structure where important pages have no internal links pointing to them, or crawl traps (redirect chains, infinite parameter URLs) that waste a bot's time. The explainers below cover the two areas you most need to understand: how robots.txt actually behaves, and when crawl budget is worth worrying about.
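As an illustration of the first cause, the sketch below compares a short Disallow prefix with the rule that was probably intended. It uses Python's standard urllib.robotparser, which matches rules by simple path prefix; the rules, URLs and the helper name blocked_urls are hypothetical, and the parser's behaviour may differ from a search engine's in more complex files.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical site URLs.
urls = [
    "https://example.com/private/report",
    "https://example.com/pricing",
    "https://example.com/products/widget",
    "https://example.com/blog/",
]

# "/p" is a prefix of /private/, /pricing and /products/.
too_broad = "User-agent: *\nDisallow: /p\n"
intended = "User-agent: *\nDisallow: /private/\n"

blocked_by_broad = blocked_urls(too_broad, urls)
blocked_by_intended = blocked_urls(intended, urls)
print(blocked_by_broad)
print(blocked_by_intended)
```

Running a list of important URLs through a proposed rule before deploying it is a cheap way to catch this kind of mistake.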

Read the Crawlability explainers

The two things people most often get wrong about how crawlers reach pages.

robots.txt for SEO →

How robots.txt works, the difference between blocking crawling and blocking indexing, and the over-broad Disallow rules that accidentally hide whole sections from search.

Crawl budget explained →

What crawl budget is, when it actually matters (and when it doesn't), and how redirects, parameters, and low-value URLs quietly burn it on large sites.

What the audit checks in this category

Grounded in the real checks the crawler runs.

robots.txt missing · robots.txt blocks the sitemap · robots.txt blocks indexable pages · robots.txt doesn't declare a sitemap · pages blocked from crawling · redirect chains and loops that trap crawlers · orphan pages with no path for discovery · deep pages more than four clicks from the homepage.

Crawlability overlaps with the Links, Redirects and Sitemaps categories, because discovery depends on all three.
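The orphan-page and click-depth checks can both be expressed as a breadth-first search over the internal link graph. The sketch below is one way to do it under stated assumptions, not the audit's actual implementation: the link graph, the function name click_depths and the sample paths are hypothetical, and the four-click threshold comes from the list above.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the homepage.

    `links` maps each page to the pages it links to. Returns a dict of
    page -> minimum number of clicks from `home`; pages that cannot be
    reached are absent from the result.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph.
links = {
    "/": ["/blog", "/shop"],
    "/blog": ["/blog/p2"],
    "/blog/p2": ["/blog/p3"],
    "/blog/p3": ["/blog/p4"],
    "/blog/p4": ["/blog/old-post"],
    "/shop": [],
    "/landing": ["/shop"],  # links out, but nothing links to it
}

depths = click_depths(links, "/")
all_pages = set(links) | {t for targets in links.values() for t in targets}

orphans = sorted(all_pages - set(depths))             # unreachable from home
deep = sorted(p for p, d in depths.items() if d > 4)  # more than four clicks
print(orphans)
print(deep)
```

Here "orphan" means unreachable by following internal links from the homepage; a page listed only in a sitemap would still show up as an orphan in this sketch.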


Make sure search engines can reach every page

Free to start. Find blocked pages, crawl traps and discovery gaps across your site.
