Question 1

Why is a page showing as blocked even though I can see it in Google?

Accepted Answer

robots.txt controls whether Google can crawl a page, not whether it appears in search results. Google may have indexed the page during a crawl made before the block was added, or it may have learned the URL from links on other sites and listed it without crawling the content. Blocking a page in robots.txt does not remove it from the index. For that you need a noindex directive (a robots meta tag or an X-Robots-Tag HTTP header), and the page must stay crawlable so that Google can see the directive.

Question 2

What is the difference between User-agent: * and User-agent: Googlebot?

Accepted Answer

User-agent: * applies to every crawler that does not have its own named rule group. User-agent: Googlebot applies only to Googlebot, which then follows that group and ignores the * rules. If you have Disallow: / under User-agent: * but Allow: / under User-agent: Googlebot, Google will crawl the site while all other compliant bots are blocked.
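
The group selection described above can be reproduced with Python's standard-library urllib.robotparser. This is a minimal sketch, not this tool's implementation: the helper name allowed_for and the example.com URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example from the answer: everyone is blocked except Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""


def allowed_for(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under `robots_txt` (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"  # placeholder URL
    print("Googlebot:", allowed_for("Googlebot", url))
    print("Bingbot:", allowed_for("Bingbot", url))
```

Googlebot matches its own named group, so the Disallow under * never applies to it; Bingbot has no named group and falls back to the * rules.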

Question 3

Does blocking GPTBot in robots.txt also block ChatGPT from reading my pages?

Accepted Answer

No. GPTBot and ChatGPT-User are separate OpenAI user agents with different purposes. GPTBot is used for training-data collection; blocking it keeps your content out of future model training datasets. ChatGPT-User is the agent used when a ChatGPT user triggers a web search or browsing action. Blocking GPTBot does not block ChatGPT-User, and vice versa.
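
A short sketch of that independence, again using Python's standard-library urllib.robotparser rather than this tool's own logic. The robots.txt content, the helper name is_blocked, and the example.com URL are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the training crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_blocked(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` is disallowed from `url` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    for agent in ("GPTBot", "ChatGPT-User"):
        print(agent, "blocked:", is_blocked(agent, url))
```

Because the file names only GPTBot and has no * group, ChatGPT-User matches no rule and stays allowed. To block both, each agent needs its own group (or a shared group listing both User-agent lines).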

Question 4

I see a Crawl-delay directive — does this tool check that?

Accepted Answer

This tool reports a Crawl-delay directive if present, but does not enforce it during checks. Crawl-delay is not part of the official robots.txt standard and is ignored by Google. Bing and some other crawlers do honour it. It is shown for informational purposes only.
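
For comparison, reading a Crawl-delay value without enforcing it might look like the following sketch. It uses Python's standard-library urllib.robotparser (its crawl_delay method requires Python 3.6 or later); the sample file and the helper name report_crawl_delay are assumptions, and this is not how the tool itself is implemented.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt containing a Crawl-delay directive.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def report_crawl_delay(robots_txt: str, agent: str) -> Optional[int]:
    """Return the Crawl-delay that applies to `agent`, or None if there is none."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() only reports the value; nothing here waits or throttles.
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    print("Crawl-delay for Bingbot:", report_crawl_delay(ROBOTS_TXT, "Bingbot"))
```

The value is simply surfaced to the caller. Whether a crawler waits between requests is up to that crawler.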

Question 5

How does the custom robots.txt mode work?

Accepted Answer

Switch to Custom robots.txt, paste (or fetch and edit) a robots.txt, then enter the URLs you want to verify. The custom file is tested against every URL regardless of its domain, so you can confirm that the pages you need indexed stay allowed, and that the pages you want blocked are actually blocked, before publishing any change to your live site.
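
The same kind of pre-publish check can be approximated offline. The sketch below is an assumption-laden illustration, not this tool's code: check_urls, the draft rules, and the URLs are hypothetical. Note also that Python's urllib.robotparser applies the first matching rule in a group, whereas Google uses the most specific (longest) match, so results can differ when Allow and Disallow rules overlap.

```python
from typing import Dict, Iterable
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt: str, urls: Iterable[str], agent: str = "Googlebot") -> Dict[str, bool]:
    """Map each URL to True (allowed) or False (blocked) under a draft robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() compares only the path and query, so the URL's domain is ignored.
    return {url: parser.can_fetch(agent, url) for url in urls}


# Hypothetical draft file and URLs from two different domains.
DRAFT = """\
User-agent: *
Disallow: /admin/
"""

URLS = [
    "https://example.com/blog/post",
    "https://staging.example.org/admin/login",
]

if __name__ == "__main__":
    for url, allowed in check_urls(DRAFT, URLS).items():
        print("ALLOWED" if allowed else "BLOCKED", url)
```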

Robots.txt Validator & Bulk URL Tester

1. Choose a robots.txt

2. Check URLs

How robots.txt Controls Crawling

How Conflicting Rules Are Resolved

Testing Multiple User Agents

Validate Changes Before You Publish

Related Tools

Frequently Asked Questions
