You can then select your website from the list. At the bottom of the dashboard page, there is a section called “Diagnostics & Tools”. Select the “Fetch as Bingbot” option and enter your URL in the box.
Once you hit submit, Bingbot will attempt to fetch your page. If your robots.txt is not blocking the URL, the response will be a 200 OK, which means Bingbot was able to read the page without being blocked.
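For reference, a rule like the sketch below in your robots.txt would cause the fetch to report the URL as blocked instead of returning a 200 OK (Bingbot is the user-agent token that Bing's crawler identifies itself with):

```
User-agent: Bingbot
Disallow: /
```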
You can test Googlebot using Google Search Console.
Log in to your account and then enter a URL into the “Inspect” box at the top of the page.
Once you enter your URL, Googlebot will crawl the page, and if everything is OK you will see a success screen.
If the robots.txt file is blocking the page, you will see an error instead.
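As a hedged example, a rule like the one below, where /example-page/ is a hypothetical stand-in for the path you are inspecting, would cause the inspection to report the page as blocked (Googlebot is the user-agent token for Google's crawler):

```
User-agent: Googlebot
Disallow: /example-page/
```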
Robots.txt examples

Let's look at five different examples of robots.txt files. These will give you a good understanding of what the rules do, and you can also use them as templates on your own site.
Block all web crawlers

The first example blocks all web crawlers from crawling your site:
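```
User-agent: *
Disallow: /
```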
There are two rules here. The first is the User-agent rule, which is set to *, a wildcard that matches every web crawler visiting your site.
The second rule is Disallow: /, which blocks every URL on your site.
You may be wondering why you would want to stop bots from visiting your site. Well, this can be useful for pre-production or staging websites, such as a site where you test development changes before pushing them to production.
Allow all web crawlers

If your site does not have a robots.txt file, then all the URLs on your site are available to web crawlers. However, some software requires you to have a robots.txt file. If that is the case, you can use this file, which allows everything:
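```
User-agent: *
Disallow:
```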
This is a bit confusing, as the Disallow rule is still listed. However, there is nothing after Disallow, which means no URLs are disallowed; every page on your site can still be crawled.
Block a folder
You may have directories that you want to keep bots out of. This could be an admin area, an image section, or user profiles.
In this case, you can disallow the folder like this:
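```
User-agent: *
Disallow: /admin/
```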
This example will stop bots from visiting the admin area. The Disallow: /admin/ rule above will block URLs like these (using example.com as a stand-in for your own domain):
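```
https://example.com/admin/
https://example.com/admin/login
https://example.com/admin/settings/profile
```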