SEO basics: What is Googlebot?

November 21st, 2017 – 14 Comments

Whenever I think of Googlebot, I see a cute, smart, Wall-E-like robot speeding off on a quest to find and index knowledge in all corners of yet unknown worlds. It’s always slightly disappointing to be reminded that Googlebot is ‘only’ a computer program written by Google that crawls the web and adds pages to its index. Here, I’ll introduce you to the crawler and show you what it does.

Googlebot? Web crawler? Spider? Huh?

All those terms mean the same thing: a bot that crawls the web. Googlebot crawls web pages via links. It finds and reads new and updated content and suggests what should be added to the index. The index, of course, is Google’s brain. This is where all the knowledge resides. Google uses a ton of computers to send its crawlers to every nook and cranny of the web to find these pages and to see what’s on them. Googlebot is Google’s web crawler or robot, and other search engines have their own.

How does Googlebot work?

Googlebot uses sitemaps and databases of links discovered during previous crawls to determine where to go next. Whenever the crawler finds new links on a site, it adds them to the list of pages to visit next. If Googlebot finds changed or broken links, it makes a note of that so the index can be updated. The program itself determines how often it will crawl pages. To make sure Googlebot can correctly index your site, you need to check its crawlability. If your site is available to crawlers, they come around often.
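To make the "list of pages to visit next" idea concrete, here is a toy sketch of a crawl queue. It is not how Googlebot is actually implemented. The `crawl_order` function and the `SITE` dictionary are made up for illustration, and the dictionary stands in for fetching real pages and extracting their links.

```python
from collections import deque


def crawl_order(seed_urls, links_on_page, max_pages=100):
    """Visit pages breadth-first and report the visit order and broken links.

    links_on_page is a hypothetical stand-in for "fetch the page and extract
    its links": it maps a URL to the list of URLs found on that page.
    """
    to_visit = deque(dict.fromkeys(seed_urls))  # the list of pages to visit next
    seen = set(to_visit)                        # URLs already queued once
    visited, broken = [], []

    while to_visit and len(visited) < max_pages:
        url = to_visit.popleft()
        if url not in links_on_page:
            broken.append(url)                  # a link that leads nowhere
            continue
        visited.append(url)
        for link in links_on_page[url]:
            if link not in seen:                # only queue links that are new
                seen.add(link)
                to_visit.append(link)
    return visited, broken


# A made-up four-page site; "/contact/" is linked to but does not exist.
SITE = {
    "/": ["/blog/", "/about/"],
    "/blog/": ["/blog/post-1/", "/"],
    "/about/": ["/contact/"],
    "/blog/post-1/": [],
}

visited, broken = crawl_order(["/"], SITE)
print(visited)
print(broken)
```

The seed URLs play the role of a sitemap or of links remembered from an earlier crawl. Everything else is discovered by following links.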

Different robots

There are several different robots. For instance, the AdSense and AdsBot crawlers check ad quality, while Mobile Apps Android checks Android apps. For us, these are the most important ones:

Name | User-agent
Googlebot (desktop) | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot (mobile) | Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot Video | Googlebot-Video/1.0
Googlebot Images | Googlebot-Image/1.0
Googlebot News | Googlebot-News
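If you want to tell these robots apart in your own tooling, a simple string match on the user-agent is often enough. The sketch below is only an illustration. The `classify_googlebot` helper is a made-up name, and it assumes the user-agent strings look like the ones listed here. It matches on the claimed name only, so it cannot tell a real Googlebot from a fake one.

```python
from typing import Optional

# Specialised crawlers are checked first, because their tokens are more specific.
SPECIAL_BOTS = (
    ("Googlebot-Video", "Googlebot Video"),
    ("Googlebot-Image", "Googlebot Images"),
    ("Googlebot-News", "Googlebot News"),
)


def classify_googlebot(user_agent: str) -> Optional[str]:
    """Return the robot name a user-agent string claims to be, or None."""
    for token, name in SPECIAL_BOTS:
        if token in user_agent:
            return name
    if "Googlebot/" in user_agent:
        # The mobile crawler presents itself as a mobile browser.
        return "Googlebot (mobile)" if "Mobile" in user_agent else "Googlebot (desktop)"
    return None


print(classify_googlebot(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))
```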

How Googlebot visits your site

To find out how often Googlebot visits your site and what it does there, you can dive into your log files or open the Crawl section of Google Search Console. If you want to do really advanced stuff to optimize the crawl performance of your site, you can use tools like Kibana or the SEO Log File Analyser by Screaming Frog.
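If you'd rather start with your own log files, a short script can already tell you a lot. The sketch below assumes your server writes the common "combined" access log format used by Apache and nginx. The `googlebot_hits` function and the sample lines are made up for illustration, so adjust the pattern if your logs look different.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about your server).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per (path, status) whose user-agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


# Made-up sample lines; the IP addresses are documentation placeholders.
SAMPLE_LOG = [
    '192.0.2.10 - - [21/Nov/2017:10:00:00 +0000] "GET /blog/ HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [21/Nov/2017:10:00:05 +0000] "GET /old-page/ HTTP/1.1" 404 312 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [21/Nov/2017:10:01:00 +0000] "GET /blog/ HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for (path, status), count in googlebot_hits(SAMPLE_LOG).items():
    print(path, status, count)
```

Paths that keep returning 404 or 5xx codes to Googlebot are a good place to start fixing things.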

Google does not share lists of IP addresses that the various Googlebots use, since these addresses change often. To find out if a real Googlebot visits your site, you can do a reverse DNS lookup on the visiting IP address. Spammers or fakers can easily spoof a user-agent name, but not an IP address. Here’s Google’s example of verifying the validity of a Googlebot.
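As a rough sketch, the check could look like this in Python: look up the host name for the IP address, check that it belongs to googlebot.com or google.com, then look up that host name again and confirm it points back to the same IP. The `is_real_googlebot` function and its two lookup parameters are made up for illustration. The parameters exist only so you can try the logic without a network connection, and the default forward lookup handles IPv4 addresses only.

```python
import socket

# Host names of genuine Googlebots are expected to end in one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Verify an IP address with a reverse lookup and a confirming forward lookup."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The host name must resolve back to the IP address we started with.
        return ip in forward_lookup(host)
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror).
        return False


# Offline demonstration with fake lookups; the names and addresses are examples.
fake_reverse = {"66.249.66.1": "crawl-66-249-66-1.googlebot.com"}.__getitem__
fake_forward = {"crawl-66-249-66-1.googlebot.com": ["66.249.66.1"]}.__getitem__
print(is_real_googlebot("66.249.66.1", fake_reverse, fake_forward))
```

Leave out the two lookup arguments to run the real DNS queries against an address from your own logs.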

You can use the robots.txt file to control how Googlebot visits your site, or parts of it. Watch out though: if you do this the wrong way, you might stop Googlebot from coming altogether. This will take your site out of the index. There are better ways to prevent your site from being indexed.
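To see how easily this goes wrong, you can test a robots.txt file before you publish it. The sketch below uses Python's built-in `urllib.robotparser` module as a stand-in. It is not Googlebot's own parser and may treat complicated rule sets differently, but for simple rules like these it shows the effect clearly. The two example files and the `is_allowed` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Blocks only one directory for Googlebot.
CAREFUL = """\
User-agent: Googlebot
Disallow: /private/
"""

# One short rule that blocks every crawler from the whole site.
TOO_BROAD = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for label, rules in (("careful", CAREFUL), ("too broad", TOO_BROAD)):
    for url in ("https://example.com/blog/", "https://example.com/private/notes/"):
        print(label, url, is_allowed(rules, "Googlebot", url))
```

With the second file, every URL comes back as not allowed, which is exactly the "stop Googlebot from coming altogether" situation.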

Google Search Console

Search Console is one of the most important tools to check the crawlability of your site. There, you can verify how Googlebot sees your site. You’ll also get a list of crawl errors for you to fix. In Search Console, you can also ask Googlebot to recrawl your site. Another way to fix these crawl errors is by connecting Yoast SEO to Search Console. You can import your errors and fix them straight from the backend of your site. Yoast SEO Premium can do even more to make your SEO easier.

Optimize for Googlebot

Getting Googlebot to crawl your site faster boils down to removing the technical barriers that prevent the crawler from accessing your site properly. It is a fairly technical process, but you should make yourself familiar with it. If Google can’t crawl your site properly, it can never make it rank for you. Find those errors and fix them!

Conclusion

Googlebot is the little robot that visits your site. If you’ve made technically sound choices for your site, it’ll come often. If you regularly add fresh content, it’ll come around even more often. Sometimes, when you’ve made large-scale changes to your site, you might have to call that cute little crawler to come at once, so the changes can be reflected in the search results as soon as possible.


14 Responses to SEO basics: What is Googlebot?

  1. Mirza Atif
    By Mirza Atif on 27 November, 2017

    Perfect piece of information about Google Bots. Most of the newbies are unaware of technicalities and only thinking that link building is SEO. Thank you for sharing.

  2. Cassper Nyovest
    By Cassper Nyovest on 22 November, 2017

    My advice for newbies: never forget to add the different bots to your robots.txt files, as it helps them crawl your site without issues.

  3. Rajinder Verma
    By Rajinder Verma on 22 November, 2017

    Hey Edwin,
    I am a newbie blogger and don’t know much about Google bots and the other crawling techniques used by Google. But after reading this valuable post on Googlebot, I have cleared up some of my basic questions about Google bots!

    Well Written Post! Keep up the awesome work!
    -Rajinder

    • havu
      By havu on 22 November, 2017

      I also like you, hope that through the posts on me and you do not make mistakes

    • Edwin Toonen
      By Edwin Toonen on 22 November, 2017

      Thanks, Rajinder!

  4. zunairnasir
    By zunairnasir on 22 November, 2017

    A very well-written article by Edwin. It’s definitely a must-read for all those webmasters who are new to SEO and want to learn about Google bots.

  5. Deep
    By Deep on 22 November, 2017

    Hi Edwin, this is a good article. I myself use Yoast premium. It helps me a lot in tracking those errors and at the same time creates redirects.
    I keep a regular watch on Google search console and see if any error emerges.

    • Edwin Toonen
      By Edwin Toonen on 22 November, 2017

      Very good, Deep! Search Console is your best friend.

  6. nexvan
    By nexvan on 22 November, 2017

    Thanks for sharing

  7. nargile
    By nargile on 21 November, 2017

    Good article. I have been facing a 302 error and have tried to resolve it.

  8. Andrej Svetelj
    By Andrej Svetelj on 21 November, 2017

    I really like this explanation. Thank you Edwin and Yoast team.

    • Edwin Toonen
      By Edwin Toonen on 22 November, 2017

      Thanks for reading, Andrej.

  9. Otlaat
    By Otlaat on 21 November, 2017

    Hi Edwin,
    I have read a lot lately that robots.txt should not be used because it gives attackers a road map to hacking opportunities

  10. nexvan
    By nexvan on 21 November, 2017

    Good article, thank you

