SERP.tools

Bulk Indexability Checker — Free Page Indexability Tool

A page can exist on your site and still be completely invisible in Google and Bing — blocked by any one of five indexability signals. Paste up to 50 URLs to find out exactly which pages are blocked and why.

Checks five signals at once: HTTP status, robots.txt, meta robots, X-Robots-Tag, and canonical — plus automatic conflict detection.


What This Tool Checks — 5 Indexability Signals

Every signal is evaluated in sequence. A page can pass four checks and fail the fifth — which is exactly why a unified tool is more useful than running three separate checkers.

1
HTTP Status Code
The server must return a 2xx response. A 4xx (404 Not Found, 403 Forbidden) or 5xx (500 Server Error) means the page cannot be crawled regardless of any other directive. A 301 redirect is followed — the tool reports the final destination's status.
2
robots.txt
The robots.txt file is fetched once per domain and parsed separately for Googlebot and Bingbot tokens. A Disallow rule tells a compliant crawler not to request the page at all, so it never reads the HTML. A noindex tag on a robots.txt-blocked page is therefore effectively invisible.
3
<meta name="robots"> / <meta name="googlebot">
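The HTML <head> is parsed for both the general robots meta tag and engine-specific variants (googlebot, bingbot). A noindex or none directive in either tag prevents the page from being indexed. When the general tag and an engine-specific tag disagree, the engine-specific tag does not simply replace the general one: Google documents that the more restrictive directive applies for that engine.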
4
X-Robots-Tag HTTP header
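This HTTP response header arrives before the HTML body and carries the same directives as the meta tag. When the two disagree, the more restrictive directive applies, so a noindex in the header is not cancelled by an index in the HTML. A server can return X-Robots-Tag: noindex on every response regardless of what the HTML contains. This is easy to miss and very common in CDN or CMS misconfiguration.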
5
rel=canonical
The canonical link is detected from both the HTML <head> and the HTTP Link header. A self-canonical (canonical = current URL) is ideal. A cross-page canonical signals to search engines that this URL is a duplicate and the canonical target should be indexed instead.
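As one illustration of check 3, here is a minimal Python sketch of reading robots meta tags with the standard library. The class and function names are hypothetical and are not SERP.tools' own code. It assumes the more restrictive directive applies when the general and bot-specific tags disagree, which is how Google documents the behaviour, and for brevity it does not restrict the scan to <head>.

```python
from html.parser import HTMLParser

NOINDEX_TOKENS = {"noindex", "none"}


class RobotsMetaParser(HTMLParser):
    """Collect directives from robots-style <meta> tags, keyed by lowercase name."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"robots": {"index"}, "bingbot": {"noindex"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        if name in ("robots", "googlebot", "bingbot"):
            tokens = {t.strip().lower() for t in attr.get("content", "").split(",") if t.strip()}
            self.directives.setdefault(name, set()).update(tokens)


def meta_noindex(html: str, bot: str) -> bool:
    """True if the general tag or the tag for `bot` carries noindex or none."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Union of general and bot-specific directives: the restrictive one wins.
    combined = parser.directives.get("robots", set()) | parser.directives.get(bot.lower(), set())
    return bool(combined & NOINDEX_TOKENS)


SAMPLE = (
    '<html><head><meta name="robots" content="index, follow">'
    '<meta name="bingbot" content="noindex"></head><body></body></html>'
)
blocked_for_bing = meta_noindex(SAMPLE, "bingbot")
blocked_for_google = meta_noindex(SAMPLE, "googlebot")
```

With the sample markup, only the Bing check reports a block, because the general tag allows indexing and the bingbot tag does not.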

Conflict Detection — The Hidden Indexability Problems

These scenarios are invisible when tools check signals separately. SERP.tools flags them automatically.

🔴

robots.txt blocks + noindex tag present

The crawler cannot read the noindex tag because robots.txt blocks access. The page may still appear in Google as a URL-only result (no snippet) because Google knows the URL exists from links — the exact opposite of what was intended.
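A per-bot robots.txt check can be approximated with Python's standard urllib.robotparser. This is a sketch under assumptions: the robots.txt content and URL are invented for the example, the helper name is hypothetical, and the standard-library parser does not reproduce every Google matching rule (it uses first-match rather than longest-match precedence), so it is a rough stand-in.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt: everyone is allowed, except bingbot under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: bingbot
Disallow: /private/
"""


def allowed_by_bot(robots_txt: str, url: str, bots=("Googlebot", "bingbot")) -> dict:
    """Map each user-agent token to True if robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


result = allowed_by_bot(ROBOTS_TXT, "https://example.com/private/report.html")
```

A False value for a bot means that crawler never requests the page, so any noindex tag in its HTML goes unread by that crawler.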

⚠️

noindex + cross-page canonical

Conflicting instructions: noindex says 'remove me', canonical says 'I belong to that other page'. Google has to choose which signal to honour, and in rare cases the noindex may be carried over to the canonical target, dropping the page you actually want indexed.

⚠️

X-Robots-Tag noindex overrides meta robots index

The HTTP header and the meta tag are combined, and the more restrictive directive applies. If the header says noindex but the meta tag says index, the page is effectively noindexed. This is a common result of CDN or reverse-proxy misconfiguration.
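The three conflicts described in this section can be expressed as simple rules over signals that have already been collected. The following Python sketch is illustrative only: the function name, the argument shapes, and the conflict labels are assumptions, not the tool's actual output.

```python
from typing import List, Set

NOINDEX = {"noindex", "none"}


def detect_conflicts(robots_allowed: bool,
                     meta_directives: Set[str],
                     header_directives: Set[str],
                     cross_page_canonical: bool) -> List[str]:
    """Return a label for each contradictory combination of signals."""
    meta_noindex = bool(meta_directives & NOINDEX)
    header_noindex = bool(header_directives & NOINDEX)
    conflicts = []
    # 1. The crawler is blocked, so it can never read the noindex.
    if not robots_allowed and (meta_noindex or header_noindex):
        conflicts.append("robots_txt_hides_noindex")
    # 2. "Remove me" and "index that other URL instead" on the same page.
    if (meta_noindex or header_noindex) and cross_page_canonical:
        conflicts.append("noindex_with_cross_page_canonical")
    # 3. The header says noindex while the HTML explicitly says index.
    meta_says_index = bool(meta_directives & {"index", "all"}) and not meta_noindex
    if header_noindex and meta_says_index:
        conflicts.append("header_noindex_vs_meta_index")
    return conflicts


example = detect_conflicts(
    robots_allowed=True,
    meta_directives={"index", "follow"},
    header_directives={"noindex"},
    cross_page_canonical=False,
)
```

Keeping the rules as independent checks means one page can report several conflicts at once.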


Frequently Asked Questions

A

Indexable means Google or Bing is allowed to show this page in search results. If a page isn't indexable, it receives zero organic traffic — no matter how good the content is.

A page is marked ✅ Indexable when nothing is blocking it: the server responds normally, your robots.txt doesn't tell the bot to stay away, and there's no 'noindex' instruction in the page headers or HTML. If any single blocker is present, the page is marked ❌ Not Indexable and the tool shows exactly which signal caused it.
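The "any single blocker" rule can be sketched in Python as follows. The Signals fields and the blocker labels are hypothetical names chosen for this example. A cross-page canonical is deliberately left out of the blocker list here, on the assumption that it is reported as a warning rather than a block.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signals:
    """Signals already collected for one URL and one crawler (hypothetical shape)."""
    status: int           # final HTTP status after following redirects
    robots_allowed: bool  # True if robots.txt permits this crawler
    meta_noindex: bool    # noindex or none found in a robots meta tag
    header_noindex: bool  # noindex or none found in X-Robots-Tag


def blockers(s: Signals) -> List[str]:
    """List every blocking signal; an empty list means nothing blocks the page."""
    found = []
    if not 200 <= s.status < 300:
        found.append(f"http_status:{s.status}")
    if not s.robots_allowed:
        found.append("robots_txt")
    if s.meta_noindex:
        found.append("meta_robots")
    if s.header_noindex:
        found.append("x_robots_tag")
    return found


def verdict(s: Signals) -> str:
    """A single blocker is enough to mark the page as not indexable."""
    return "Not Indexable" if blockers(s) else "Indexable"


page = Signals(status=200, robots_allowed=True, meta_noindex=False, header_noindex=True)
page_blockers = blockers(page)
page_verdict = verdict(page)
```

Returning the full list of blockers, rather than stopping at the first one, is what lets a report name the exact signal responsible.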

A

Most likely one of the other four signals is failing. Common causes in order of frequency:

  • HTTP status is not 2xx: a 4xx or 5xx response means the page cannot be crawled at all.
  • robots.txt blocks the crawler: Googlebot or Bingbot is disallowed before it even reads the HTML.
  • X-Robots-Tag header contains noindex: a noindex in the HTTP header applies even if the HTML meta tag says index.
  • The page redirects to a noindexed destination: the redirect chain ends at a blocked page.
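The redirect case in the last bullet can be checked by walking the chain to its end and then evaluating the final URL. The Python sketch below assumes a hypothetical fetch callable that returns a (status, location) pair, so the logic can be exercised without network access. The hop limit of 10 is an arbitrary choice for the example.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}

# fetch(url) -> (status, location); location is the Location header or None.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def follow_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> Tuple[List[str], int]:
    """Follow redirects and return (chain of URLs visited, status of the last one)."""
    chain = [url]
    status, location = fetch(url)
    while status in REDIRECT_CODES and location and len(chain) <= max_hops:
        target = urljoin(chain[-1], location)  # Location may be relative
        if target in chain:
            break  # redirect loop: stop and report the redirect status
        chain.append(target)
        status, location = fetch(target)
    return chain, status


# Stand-in for real HTTP responses, invented for this example.
FAKE_SITE = {
    "https://example.com/old": (301, "/new"),
    "https://example.com/new": (200, None),
}
chain, final_status = follow_redirects("https://example.com/old", FAKE_SITE.__getitem__)
```

The last URL in the chain is the one whose robots.txt, meta robots, X-Robots-Tag and canonical signals decide the outcome.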

A

A conflict occurs when two signals contradict each other in a way that produces an unexpected result. The most dangerous is robots.txt blocking + noindex tag present: the crawler is blocked by robots.txt and therefore cannot read the noindex tag in the HTML. The page may still appear in Google's index as a URL-only result (no snippet) because Google knows the URL exists from links — the exact opposite of what the noindex was intended to achieve.

Another common conflict is X-Robots-Tag noindex overriding meta robots index: the more restrictive directive applies, so even if the HTML tag says index, the header's noindex wins.

A
The tool fetches each URL using the official Googlebot user-agent string and checks the five signals Google documents: HTTP status, robots.txt, meta robots, X-Robots-Tag, and rel=canonical. For robots.txt, the check is accurate — the parsing rules match what Google uses. The HTTP fetch comes from SERP.tools' server, not from Google's IP ranges — IP-based WAF rules may therefore behave differently from a real Googlebot request.
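For readers who want to reproduce a similar fetch themselves, this minimal Python sketch prepares a request that identifies as Googlebot. The helper name is hypothetical and the user-agent value is Googlebot's published desktop "compatible" string. As noted above, a request built this way still comes from your own IP address, so IP-based rules will not treat it as a genuine Googlebot visit.

```python
from urllib.request import Request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> Request:
    """Prepare (but do not send) a GET request with the given crawler user-agent."""
    return Request(url, headers={"User-Agent": user_agent}, method="GET")


request = build_request("https://example.com/")
```

Passing the prepared request to urllib.request.urlopen with a timeout would perform the fetch. That call follows redirects by default and raises urllib.error.HTTPError for 4xx and 5xx responses.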

A
The robots.txt check is bot-specific: Googlebot and bingbot are different user-agent tokens and may be covered by different rules in robots.txt. Similarly, a page can serve a <meta name="bingbot" content="noindex"> tag, or an X-Robots-Tag: bingbot: noindex header, to block only Bing. Bing's index also feeds Yahoo Search and Microsoft Copilot, so a Bingbot block reaches further than it appears.
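A bot-scoped header such as X-Robots-Tag: bingbot: noindex needs slightly more careful parsing than a plain one, because some directives (for example unavailable_after) also contain a colon. The Python sketch below is one possible approach. The function name and the list of value-carrying directives are assumptions for the example, not an exhaustive specification.

```python
import re
from typing import Iterable

# Directives that take a value after a colon, so their name is not a bot name.
VALUE_DIRECTIVES = {"unavailable_after", "max-snippet", "max-image-preview", "max-video-preview"}
SCOPED = re.compile(r"\s*([A-Za-z0-9_-]+)\s*:(.*)")


def header_noindex(header_values: Iterable[str], bot: str) -> bool:
    """True if any X-Robots-Tag value that applies to `bot` has noindex or none."""
    for value in header_values:
        directives = value
        match = SCOPED.match(value)
        if match and match.group(1).lower() not in VALUE_DIRECTIVES:
            # Form "bingbot: noindex": the rules apply only to the named crawler.
            if match.group(1).lower() != bot.lower():
                continue
            directives = match.group(2)
        tokens = {token.strip().lower() for token in directives.split(",")}
        if tokens & {"noindex", "none"}:
            return True
    return False


bing_only = ["bingbot: noindex"]
blocks_bing = header_noindex(bing_only, "bingbot")
blocks_google = header_noindex(bing_only, "googlebot")
```

The function takes a list of values because a response may carry several X-Robots-Tag headers, each scoped to a different crawler.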

A
A cross-page canonical means the page's rel=canonical points to a different URL. This tells search engines "this URL is a duplicate — please index the canonical target instead." The page itself is unlikely to appear in search results, even if it passes all other signals. The tool flags this as a warning but does not automatically mark the page as non-indexable, because canonical hints are not strictly enforced — Google and Bing may choose to override them.
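Telling a self-canonical from a cross-page canonical comes down to comparing two URLs after light normalisation. The Python sketch below reads the canonical from an HTTP Link header and classifies it. The function names are hypothetical, and the normalisation rules (lowercase scheme and host, empty path treated as "/", fragment dropped) are assumptions for the example. Search engines apply their own logic, and this regex does not handle URLs that contain angle brackets.

```python
import re
from typing import Optional
from urllib.parse import urljoin, urlsplit, urlunsplit


def canonical_from_link_header(link_header: str) -> Optional[str]:
    """Return the first rel="canonical" target in an HTTP Link header, or None."""
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        if re.search(r'rel\s*=\s*"?canonical\b', params, re.IGNORECASE):
            return target
    return None


def normalize(url: str) -> str:
    """Lowercase scheme and host, default the path to "/", drop the fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, ""))


def classify_canonical(page_url: str, canonical: Optional[str]) -> str:
    """Return "missing", "self" or "cross-page" for a page and its canonical."""
    if not canonical:
        return "missing"
    target = urljoin(page_url, canonical)  # the canonical may be relative
    return "self" if normalize(target) == normalize(page_url) else "cross-page"


header = '<https://example.com/shoes>; rel="canonical"'
kind = classify_canonical("https://example.com/shoes?color=red",
                          canonical_from_link_header(header))
```

In this example the filtered URL points at the unfiltered one, so it is classified as cross-page, which corresponds to the warning case described above.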

A
Run it after any of these events: CMS or platform migration, robots.txt change, CDN or WAF configuration update, deploy of a new template, bulk publishing of new pages, or any time you notice pages disappearing from Google Search Console's Page indexing report (formerly the Coverage report). For large sites, schedule a weekly audit of your highest-priority pages, because accidental noindex tags are one of the most common SEO regressions.