There are multiple ways to tell search engines how to behave on your site. These are called “crawl directives”. They allow you to:
- tell a search engine not to crawl a page at all;
- tell it not to use a page in its index after it has crawled it;
- tell it whether or not to follow the links on that page;
- set a lot of “minor” directives.
We write a lot about these crawl directives as they are a very important weapon in an SEO’s arsenal. We try to keep these articles up to date as standards and best practices evolve.
You probably know that Yoast SEO helps you determine what should and should not be indexed. But did you know that it also checks if your site is indexable or not? Thanks to Ryte, we can check if your site is still reachable for both search engine bots and visitors. This is the indexability check.
The robots.txt file tells search engines where they can and cannot go on your site. Learn how to use it to your advantage!
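To make that concrete, here is a minimal sketch that checks URLs against a robots.txt file using Python's standard `urllib.robotparser` module. The rules and the example.com URLs are made up for illustration. Note that this parser applies the first matching rule, which may differ from how a particular search engine resolves conflicting Allow and Disallow lines.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether any crawler ("*") may fetch each URL under these rules.
for url in ("https://example.com/blog/", "https://example.com/private/page"):
    print(url, parser.can_fetch("*", url))
```

In practice you would point the parser at a live file with `set_url()` and `read()` instead of parsing a string.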
Must-read articles about Crawl directives
Trying to prevent indexing of your site by using robots.txt is a no-go; use X-Robots-Tag or a meta robots tag instead! Here's why.
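The short version: a page blocked in robots.txt is never fetched, so a crawler cannot see a noindex instruction on it. As a rough sketch, the two alternatives look like this. The helper names below are hypothetical and exist only for this example; they simply build the tag and the HTTP header as strings.

```python
from typing import Dict


def robots_directive(index: bool = True, follow: bool = True) -> str:
    """Build a directive string such as 'noindex, follow'."""
    return ", ".join([
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ])


def meta_robots_tag(index: bool = True, follow: bool = True) -> str:
    """Return a meta robots tag to place in the page's <head>."""
    return f'<meta name="robots" content="{robots_directive(index, follow)}">'


def x_robots_tag_header(index: bool = True, follow: bool = True) -> Dict[str, str]:
    """Return the equivalent HTTP response header, useful for non-HTML files."""
    return {"X-Robots-Tag": robots_directive(index, follow)}


print(meta_robots_tag(index=False))
print(x_robots_tag_header(index=False))
```

The header variant is the usual choice for files that have no HTML head, such as PDFs.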
The canonical URL allows you to tell search engines that certain similar URLs are actually one and the same. Learn how to use rel=canonical!
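As an illustration, the sketch below reads the canonical URL from a page's HTML with Python's standard `html.parser` module. The sample markup and the example.com URLs are invented for the example, and `CanonicalFinder` is a hypothetical helper, not part of any SEO tool.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page: a filtered URL that points to its clean version.
SAMPLE = '<head><link rel="canonical" href="https://example.com/shoes/"></head>'
print(find_canonical(SAMPLE))
```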
Want to keep a page out of the search results? Ask yourself if it should be on your site anyway. If it should, use a robots meta tag to prevent it from being indexed.
Search engines need a bit of help to qualify links; use the nofollow, sponsored and UGC attributes to help them out. With Yoast SEO it's easy!
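In the HTML itself, these qualifiers are values of a link's `rel` attribute, and they can be combined. Here is a small sketch of how such a link could be generated. The `qualified_link` helper is hypothetical and only shows the resulting markup; it is not how Yoast SEO does it.

```python
from html import escape

# The three link qualifiers mentioned above.
ALLOWED_REL_VALUES = ("nofollow", "sponsored", "ugc")


def qualified_link(href: str, text: str, *rel: str) -> str:
    """Return an <a> element with optional rel qualifiers."""
    unknown = [value for value in rel if value not in ALLOWED_REL_VALUES]
    if unknown:
        raise ValueError(f"Unsupported rel value(s): {unknown}")
    rel_value = " ".join(rel)
    rel_attribute = f' rel="{rel_value}"' if rel_value else ""
    return f'<a href="{escape(href, quote=True)}"{rel_attribute}>{escape(text)}</a>'


# A paid link that should not pass on ranking credit.
print(qualified_link("https://example.com/", "Example", "sponsored", "nofollow"))
```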
Recent Crawl directives articles
How does a search engine get that search result? The process consists of three parts: crawling, indexing and ranking. Let's focus on indexing at Google.
What is crawlability? In what ways could you block Google from crawling (parts of your) site? Why is crawlability important for SEO?
Just pushed the button and launched your new site? Learn what to do first when you start working on SEO for a new website!