Robots.txt Validator & Bulk URL Tester
Check If Pages Are Blocked
Fetch any site's robots.txt, then test up to 100 URLs against it for Googlebot, Bingbot, GPTBot and more. See the exact rule — and the line number — that blocks each page, or paste a custom robots.txt to validate changes before you publish them.
How robots.txt Controls Crawling
A robots.txt file lives at the root of a host
(yourdomain.com/robots.txt), and a well-behaved crawler checks it
before requesting any page from that host. Each subdomain needs its own file. It uses a simple set of directives (User-agent, Allow, and Disallow)
to tell each bot which paths it may and may not crawl. A single misplaced Disallow: / can block crawlers from an entire site, so
verifying the file against your real URLs matters far more than checking
its syntax in isolation.
This tool does exactly that. It fetches the live robots.txt for each domain you test, parses it the same way Googlebot does, and reports — per URL and per user agent — whether the page is allowed, and if not, the precise rule and line number responsible.
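As a rough illustration of the "one robots.txt per host" idea, the sketch below derives the robots.txt URL that governs a page and groups a URL list so each file only needs to be fetched once. The function names `robots_url` and `group_by_robots` are hypothetical and chosen for this example; this is not the tool's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the scheme, host, and port of page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Path, query, and fragment are dropped: robots.txt sits at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def group_by_robots(urls):
    """Group URLs by the robots.txt file that applies to each of them."""
    groups = {}
    for url in urls:
        groups.setdefault(robots_url(url), []).append(url)
    return groups


if __name__ == "__main__":
    pages = [
        "https://example.com/blog/post?x=1",
        "https://example.com/about",
        "https://shop.example.com/cart",
    ]
    for robots, members in group_by_robots(pages).items():
        print(robots, len(members))
```

Note that `shop.example.com` ends up in its own group: a subdomain is a different host, so it is governed by a different robots.txt.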
How Conflicting Rules Are Resolved
When more than one rule could apply to a URL, Google and other crawlers that follow the Robots Exclusion Protocol (RFC 9309) use the most specific match: the rule with the longest path
pattern wins, whether it is an Allow or a Disallow. If an Allow and a Disallow are equally specific, the Allow wins. This is
why Allow: /blog/
can override a broader Disallow: /
for pages under /blog/. The tool surfaces the actual winning
rule for each URL so you never have to trace the precedence by hand.
✅ Allowed
No matching Disallow rule, or a more specific Allow rule overrides a broader Disallow. The crawler may fetch the page.
❌ Blocked
A Disallow rule matches and is not overridden. Click any blocked cell to see the exact rule and jump to its line in the robots.txt viewer.
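The longest-match precedence described above can be sketched in a few lines. This is a minimal illustration, not the tool's implementation: the `Rule` type and the function names are made up for the example, pattern length is measured in characters, and ties go to Allow, which is how Google documents its behavior and what RFC 9309 recommends.

```python
import re
from typing import NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    line: int      # 1-based line number in robots.txt (hypothetical field)
    allow: bool    # True for Allow, False for Disallow
    pattern: str   # path pattern; may contain * and a trailing $


def pattern_to_regex(pattern: str):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # '*' matches any run of characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def winning_rule(rules: Sequence[Rule], path: str) -> Optional[Rule]:
    """Return the most specific matching rule, or None if nothing matches."""
    best = None
    for rule in rules:
        if not rule.pattern:  # an empty pattern matches nothing
            continue
        if not pattern_to_regex(rule.pattern).match(path):
            continue
        longer = best is None or len(rule.pattern) > len(best.pattern)
        tie_for_allow = (
            best is not None
            and len(rule.pattern) == len(best.pattern)
            and rule.allow
            and not best.allow
        )
        if longer or tie_for_allow:
            best = rule
    return best


def is_allowed(rules: Sequence[Rule], path: str) -> bool:
    """A path is allowed when no rule matches or the winning rule is an Allow."""
    rule = winning_rule(rules, path)
    return rule is None or rule.allow


if __name__ == "__main__":
    rules = [Rule(2, False, "/"), Rule(3, True, "/blog/")]
    print(winning_rule(rules, "/blog/post"))
    print(winning_rule(rules, "/about"))
```

Keeping the line number on each rule is what makes it possible to report which line blocked a URL, rather than only a yes or no.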
Testing Multiple User Agents
robots.txt rules can target specific crawlers. A site might welcome
Googlebot while blocking AI training bots like GPTBot and
Google-Extended — or accidentally block Bingbot with an overly broad
rule. Select any combination of user agents and the tool evaluates each
URL against the rule block that applies to that specific bot, including
the catch-all User-agent: * fallback.
- Googlebot: Google Search crawler
- Bingbot: Microsoft Bing crawler
- GPTBot: OpenAI training crawler
- ClaudeBot: Anthropic training crawler
- Google-Extended: Google AI (Gemini) training
- All bots (*): the catch-all rule block
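To make the per-bot evaluation concrete, here is a simplified sketch of splitting a robots.txt into rule blocks and picking the block that applies to a given bot. It assumes exact, case-insensitive matching on the user-agent token and falls back to the * block otherwise; real crawlers may apply looser token matching. The names `parse_groups` and `rules_for` are hypothetical.

```python
def parse_groups(robots_txt: str) -> dict:
    """Map each lowercased user-agent token to its (line_no, directive, value) rules."""
    groups = {}
    current_agents = []
    reading_agents = False  # True while consecutive User-agent lines are collected
    for line_no, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                current_agents = []  # a new block starts here
                reading_agents = True
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            reading_agents = False
            for agent in current_agents:
                groups[agent].append((line_no, field, value))
    return groups


def rules_for(groups: dict, bot: str) -> list:
    """Return the bot's own block if it has one, otherwise the * block."""
    token = bot.lower()
    if token in groups:
        return groups[token]
    return groups.get("*", [])


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /\n\nUser-agent: Googlebot\nAllow: /\n"
    groups = parse_groups(sample)
    print(rules_for(groups, "Googlebot"))
    print(rules_for(groups, "Bingbot"))
```

The key design point is that a bot with its own block uses only that block; the * rules are a fallback, not something merged in on top.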
Validate Changes Before You Publish
Editing robots.txt on a live site is risky: one wrong rule can cut crawlers off from thousands of pages. Switch to Custom robots.txt mode, paste your proposed file (or fetch the live one and edit it), and test it against the URLs that matter most. Confirm that key pages stay allowed and the paths you want hidden are actually blocked, all before the change reaches production.
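The same pre-publish check can be thought of as a small set of expectations: for each bot and path, state whether it should be allowed, then compare against what the draft file produces. The sketch below is illustrative only. `failed_expectations` is a hypothetical helper, and `draft_checker` is a stand-in for whatever evaluates the proposed file (here it hard-codes the effect of a draft containing only Disallow: /admin/ for all bots).

```python
from typing import Callable, Dict, List, Tuple


def failed_expectations(
    is_allowed: Callable[[str, str], bool],
    expectations: Dict[Tuple[str, str], bool],
) -> List[str]:
    """Return one message per (user agent, path) whose result differs from the expectation."""
    failures = []
    for (agent, path), expected in expectations.items():
        actual = is_allowed(agent, path)
        if actual != expected:
            want = "allowed" if expected else "blocked"
            got = "allowed" if actual else "blocked"
            failures.append(f"{agent} {path}: expected {want}, got {got}")
    return failures


def draft_checker(agent: str, path: str) -> bool:
    """Stand-in for evaluating a draft with 'User-agent: *' and 'Disallow: /admin/'."""
    return not path.startswith("/admin/")


if __name__ == "__main__":
    expectations = {
        ("Googlebot", "/"): True,
        ("Googlebot", "/admin/login"): False,
        ("GPTBot", "/blog/"): False,  # the draft forgot to block GPTBot
    }
    for message in failed_expectations(draft_checker, expectations):
        print(message)
```

An empty result means the draft behaves as intended for every URL you listed; any message points at a page whose status would change on publish.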
Frequently Asked Questions
Why is a page showing as blocked even though I can see it in Google?
[...] noindex directive.
What is the difference between User-agent: * and User-agent: Googlebot?
User-agent: * applies to all crawlers that do not have their own named rule block. User-agent: Googlebot applies specifically to Googlebot and overrides the * rules for it. If you have Disallow: / under User-agent: * but Allow: / under User-agent: Googlebot, Google will crawl the site while all other bots are blocked.
Does blocking GPTBot in robots.txt also block ChatGPT from reading my pages?
I see a Crawl-delay directive. Does this tool check that?
How does the custom robots.txt mode work?
