Crawl Error: “URL restricted by robots.txt”
This error appears when Google cannot crawl a specific URL because of a robots.txt restriction.
The URL Should Not Be Crawled
Your site is set up correctly. Google will respect the restriction and will not crawl the URL.
The URL Should Be Crawled
Google may be unable to crawl this URL for several reasons. Check the following places for a statement that prevents crawling or indexing.
- The robots meta tag – View the page source of the URL and look for a tag like this: <meta name="robots" content="noindex" />
- The X-Robots-Tag HTTP header – Check the HTTP response headers for noindex, noarchive, or another restrictive directive. On Apache-based web servers, you can modify the X-Robots-Tag in the .htaccess or httpd.conf file. A restrictive header looks like this: X-Robots-Tag: noindex
- The actual robots.txt file – This file must be located at the root of your host, for example http://www.[yourdomainname].com/robots.txt; crawlers do not look for it anywhere else. If you are unsure how to access or edit your robots.txt file, please contact your web host for assistance. Once you have confirmed that the robots.txt file is causing the problem, remove or narrow the rule that blocks the URL. A blocking rule looks like this: Disallow: [Path Or / ]
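The three checks above can be sketched in Python using only the standard library. This is a minimal illustration, not an official Google tool: the function names, the sample rules, and the example.com URLs are hypothetical, and the standard-library robots.txt parser may not match Googlebot's behavior in every edge case (for example, wildcard patterns).

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def meta_robots_directives(html):
    """Return the robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def header_robots_directives(headers):
    """Return the X-Robots-Tag directives from a dict of response headers.

    Simplification: a user-agent prefix such as "googlebot: noindex"
    is not separated from the directive.
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            return [d.strip().lower() for d in value.split(",") if d.strip()]
    return []


def blocked_by_robots_txt(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text disallows the URL for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical inputs; replace them with your own page source, headers, and rules.
    sample_html = '<html><head><meta name="robots" content="noindex" /></head></html>'
    sample_headers = {"X-Robots-Tag": "noindex"}
    sample_rules = "User-agent: *\nDisallow: /private/\n"

    print(meta_robots_directives(sample_html))
    print(header_robots_directives(sample_headers))
    print(blocked_by_robots_txt(sample_rules, "http://www.example.com/private/page.html"))
```

The functions take plain strings and dictionaries rather than fetching anything, so you can paste in the page source, response headers, and robots.txt contents you have already retrieved and see which of the three places contains the restriction.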
In some cases, Google may be temporarily unable to reach your server; this usually resolves itself within a few days. If the warning continues to appear, please reach out to Google for further assistance.