Yoast SEO Premium settings: Crawl optimization

As of Yoast SEO 20.0, the settings interface inside our plugins has received a major overhaul. Please update to the latest version of Yoast SEO if your plugin does not look like the screenshots you encounter in our Help center.

Crawlability is essential in SEO. If you want search engines to find your site and show it in the search results, your site must be crawlable. Not only that, but you must ensure that search engines get a chance to crawl the pages that you want to rank with. There is no easy way to do that. But, with Yoast SEO Premium, you can clear all the URLs that don’t have any SEO value out of the search engine’s way.

There is also another, less talked about, side to this story. Crawling requires a lot of resources. Search engines and other parties, such as various apps, need a lot of electricity to crawl the growing number of sites and their URLs. Website owners also need powerful servers so that visitors can browse their sites and robots can crawl them. So, by making crawling more efficient, you contribute not only to your site’s SEO but also to consuming less electricity!

The crawl optimization settings in Yoast SEO Premium help you clear up unnecessary URLs and help search engines crawl your site more efficiently. You can find the settings by clicking Yoast SEO -> Settings -> Advanced -> Crawl optimization. Below, you can read about what the settings do.

Basic crawl optimization settings

The crawl optimization settings in Yoast SEO Premium

Unlike humans, who read what’s on the front end of your site, robots read what they find in the source code. If you open the source code of your site (see the image below), you will notice many URLs there. When crawlers come to crawl your site, they’ll visit each of the URLs they find. And they will do that tens or hundreds of times per day.

So, why is that a problem? Well, WordPress adds a lot of URLs and tags to your website’s header and <head> section. A lot of those additions are unnecessary, and they don’t have any SEO value. So, we’ve created multiple toggles that allow you to disable a specific piece of output. Below, you can read more about what each of the toggles does.

In the <head> section of a single post, WordPress outputs a shortlink (see example below).

<link rel='shortlink' href='http://testsite.com/?p=1' />

The short link is basically a shortened version of the URL of the same page. With this toggle, you can remove that output.

The WordPress REST API is a developer-oriented feature that lets applications interact with your WordPress site. By default, WordPress adds a REST API link to the <head> of your site for discoverability.

<link rel="https://api.w.org/" href="http://testsite.com/wp-json/" />

However, most sites don’t use the WordPress REST API. If your site is one of those, you can safely remove the link with this feature.

The RSD (Really Simple Discovery) link in the <head> of your site helps external editing clients and services discover how to connect to your site through XML-RPC. If you do not use such clients or services, it is safe to remove the link with this toggle.

<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://testsite.com/xmlrpc.php?rsd" />

The WLW link is intended for users of the discontinued Windows Live Writer. If you do not use it, you can safely remove this link as well.

<link rel="wlwmanifest" type="application/wlwmanifest+xml" href="http://testsite.com/wp-includes/wlwmanifest.xml" />

With this toggle, you can remove the oEmbed links from the <head> section of all your single posts.

<link rel="alternate" type="application/json+oembed" href="http://testsite.com/wp-json/oembed/1.0/embed?url=http%3A%2F%2Ftestsite.com%2F2022%2F05%2Fhello-world%2F" />
<link rel="alternate" type="text/xml+oembed" href="http://testsite.com/wp-json/oembed/1.0/embed?url=http%3A%2F%2Ftestsite.com%2F2022%2F05%2Fhello-world%2F&format=xml" />

These links help other sites consume your content. You won’t harm any of your content by removing them.

Generator tag

The generator tag displays the WordPress version your site is using.

<meta name="generator" content="WordPress 6.0" />

This tag has no SEO value. In fact, it can be a security risk, because it tells anyone who looks at your source code which WordPress version you run. You can easily remove it with this toggle.
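To see which of the <head> outputs described so far are present on a page, you could scan its source code. The following is a minimal sketch in Python, not part of Yoast SEO: the class name, the function name, and the labels are made up for illustration, and the sample markup reuses the testsite.com examples from this article.

```python
from html.parser import HTMLParser

# Hypothetical labels for the <link rel="..."> values shown in this article.
# Keys are lowercase because the scanner lowercases rel values before lookup.
REL_LABELS = {
    "shortlink": "shortlink",
    "https://api.w.org/": "REST API link",
    "edituri": "RSD link",
    "wlwmanifest": "WLW link",
}


class HeadOutputScanner(HTMLParser):
    """Collects the removable <head> outputs found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. bare attributes), so normalize them.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link":
            rel = attrs.get("rel", "").lower()
            if rel in REL_LABELS:
                self.found.append(REL_LABELS[rel])
            elif rel == "alternate" and "oembed" in attrs.get("type", "").lower():
                self.found.append("oEmbed link")
        elif tag == "meta" and attrs.get("name", "").lower() == "generator":
            self.found.append("generator tag")


def scan_head(html):
    """Return the labels of all removable outputs, in document order."""
    scanner = HeadOutputScanner()
    scanner.feed(html)
    return scanner.found


SAMPLE_HEAD = """
<head>
<link rel='shortlink' href='http://testsite.com/?p=1' />
<link rel="https://api.w.org/" href="http://testsite.com/wp-json/" />
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://testsite.com/xmlrpc.php?rsd" />
<meta name="generator" content="WordPress 6.0" />
</head>
"""

print(scan_head(SAMPLE_HEAD))
```

Running the same scan before and after switching a toggle is one way to confirm that the output is gone.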

Emoji scripts

If you don’t use emojis in your content, you most likely won’t need the emoji support WordPress adds to your site. With this toggle, you will remove the <link rel='dns-prefetch' href='//s.w.org' /> line, as well as a long section of script related to emojis from your site.

Pingback HTTP header

Pingbacks are used to notify you when someone has added a link to your site. However, this standard is very old, and you are most likely not using it anymore. If you switch the toggle to remove, the X-Pingback: http://testsite.com/xmlrpc.php header is removed from the response headers.

Powered by HTTP header

With this toggle, you remove the information about the PHP version your site is using from the response header. This information is not required for your site to function properly, so you can safely remove it.
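If you want to check whether a response still carries these two headers, a small helper like the one below could do it. This is a sketch under assumptions: the function name is hypothetical, X-Powered-By is assumed to be the header behind the "Powered by" toggle (it is the header PHP commonly uses to expose its version), and the sample values are invented.

```python
# Header names assumed to correspond to the two toggles described above.
REMOVABLE_HEADERS = ("X-Pingback", "X-Powered-By")


def find_removable_headers(headers):
    """Return the removable headers present in a response.

    `headers` is any mapping of header names to values. Names are
    compared case-insensitively, as HTTP header names are not case-sensitive.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in REMOVABLE_HEADERS
        if name.lower() in lowered
    }


# Invented sample response headers, for illustration only.
sample = {
    "Content-Type": "text/html; charset=UTF-8",
    "x-pingback": "http://testsite.com/xmlrpc.php",
    "X-Powered-By": "PHP/8.1.0",
}

print(find_removable_headers(sample))
```

For a live site, the mapping could come from `urllib.request.urlopen(url).headers`, which also supports `.items()`.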

Feed crawl settings

Your site probably has more URLs than you realize. For instance, WordPress creates feeds for a lot of content on your site, which can be a problem for crawlers. A crawler will start crawling the URLs, and, at some point, it might run out of crawl budget. As a result, there won’t be any budget left for your important posts and pages. That’s why it’s wise to remove those URLs and let search engines crawl your site more efficiently.

In the Yoast SEO Premium Crawl settings, you can toggle multiple switches that let you keep or remove the various feeds. We don’t automatically remove them for each site because we can’t predict the needs of all Yoast SEO Premium users. But, if you don’t get any value from them, we recommend you switch the toggles on. Below, you can see exactly which feeds you can remove with the Crawl settings.

Global feed

  • Type of page: any page
  • Example feed: http://testsite.com/feed/

Global comments feed

  • Type of page: any page
  • Example feed: http://testsite.com/comments/feed/

Note: disabling this feed will also disable the post comments feeds.

Post author feeds

  • Type of page: author archive, e.g., http://testsite.com/author/admin/
  • Example feed: http://testsite.com/author/admin/feed/

Post type feeds

  • Type of page: post type archive, e.g., http://testsite.com/my-books/
  • Example feed: http://testsite.com/my-books/feed/

Category feeds

  • Type of page: category archive, e.g., http://testsite.com/category/fiction/
  • Example feed: http://testsite.com/category/fiction/feed/

Tag feeds

  • Type of page: tag archive, e.g., http://testsite.com/tag/fantasy/
  • Example feed: http://testsite.com/tag/fantasy/feed/

Custom taxonomy feeds

  • Type of page: custom taxonomy archive, e.g., http://testsite.com/book-genre/crime/
  • Example feed: http://testsite.com/book-genre/crime/feed/

Search results feeds

  • Type of page: search results, e.g., http://testsite.com/?s=world
  • Example feed: http://testsite.com/search/world/feed/rss2/

Atom/RDF feeds

  • Type of page: any page
  • Example feed: any feed listed above, with /atom or /rdf added at the end, e.g.:
    • http://testsite.com/feed/atom
    • http://testsite.com/feed/rdf
    • http://testsite.com/comments/feed/atom
    • http://testsite.com/comments/feed/rdf
    • http://testsite.com/hello-world/feed/atom
    • http://testsite.com/hello-world/feed/rdf
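The feed URLs listed above follow a simple pattern, which means every page or archive on your site can add several crawlable URLs. The sketch below illustrates that pattern for pages with a pretty URL. The function name is made up, and the sketch does not cover search result feeds, which use a different URL structure.

```python
def feed_variants(page_url):
    """Return the default, Atom, and RDF feed URLs for a pretty page URL."""
    base = page_url.rstrip("/") + "/feed/"
    return [base, base + "atom", base + "rdf"]


# Each page or archive yields three feed URLs following this pattern.
for page in ("http://testsite.com/", "http://testsite.com/author/admin/"):
    for feed_url in feed_variants(page):
        print(feed_url)
```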

Remove unused resources

  • Emoji scripts
    Remove JavaScript used for converting emoji characters in older browsers.
  • Prevent search engines from crawling /wp-json/
    Add a ‘disallow’ rule to your robots.txt file to prevent crawling of WordPress’ JSON API endpoints, for example, https://www.example.com/wp-json/ and https://www.example.com/?rest_route=/
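To get a feel for what such a ‘disallow’ rule does, you can test URLs against a robots.txt file with Python's standard library. The robots.txt content below is an assumption about what rules of this kind might look like; it is not copied from the plugin's output, so check your own robots.txt for the exact lines.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-json/
Disallow: /?rest_route=
"""


def build_parser(robots_txt):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Print whether a generic crawler ("*") may fetch each URL.
for url in (
    "https://www.example.com/wp-json/",
    "https://www.example.com/?rest_route=/",
    "https://www.example.com/hello-world/",
):
    print(url, parser.can_fetch("*", url))
```

Keep in mind that a disallow rule asks well-behaved crawlers not to fetch those URLs; it does not disable the endpoints themselves.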

Internal site search cleanup

Spammers sometimes target internal site search URLs on your site for their own purposes. Those URLs might get crawled by Google, and might be seen by users. That can harm your SEO (and your branding)! This feature identifies some common spam patterns and stops them in their tracks. The common spam patterns our plugin cleans up are: TALK: QQ: [:()【】[]].

Redirect pretty URLs for search pages to raw format (Premium)

WordPress supports two endpoint formats for site search queries:

  • A raw format: example.com/?s=example
  • A pretty format: example.com/search/example

The pretty format will only be supported when pretty permalinks are enabled. When both formats exist, this can lead to problems, because this doubles the number of URLs that search engines can crawl. In addition, it can increase the number of ways in which your site can be attacked by spammers.

Therefore, we provide an option to turn off one of these formats in Yoast SEO Premium. When you switch the toggle next to “Redirect pretty URLs for search pages to raw format” to on, Yoast SEO disables the pretty format. The plugin then redirects requests from the pretty format to the raw format, while maintaining any query parameters and/or pagination. That’s good for your SEO and for the environment!

We disable the pretty format because the raw format is relatively universal, language- and territory-agnostic, and more (natively) interoperable with most analytics and tracking systems.
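The mapping from the pretty format to the raw format can be sketched as follows. This illustrates the idea and is not the plugin's actual code: the function name is hypothetical, and the sketch assumes the default WordPress URL structure, in which pretty search URLs start with /search/, pretty pagination uses /page/N/, and the raw format uses the s and paged query parameters.

```python
import re
from urllib.parse import parse_qsl, unquote_plus, urlencode, urlsplit, urlunsplit

# Matches /search/<term>/ with optional /page/<number>/ (assumed defaults).
PRETTY_SEARCH = re.compile(r"^/search/(?P<term>[^/]+)(?:/page/(?P<page>\d+))?/?$")


def pretty_search_to_raw(url):
    """Return the raw-format equivalent of a pretty search URL.

    Returns None when the URL is not a pretty search URL.
    """
    parts = urlsplit(url)
    match = PRETTY_SEARCH.match(parts.path)
    if match is None:
        return None

    # The search term becomes the "s" parameter.
    params = [("s", unquote_plus(match.group("term")))]
    # Pagination, if present, becomes the "paged" parameter.
    if match.group("page"):
        params.append(("paged", match.group("page")))
    # Any existing query parameters are carried over unchanged.
    params.extend(parse_qsl(parts.query, keep_blank_values=True))

    return urlunsplit((parts.scheme, parts.netloc, "/", urlencode(params), ""))


print(pretty_search_to_raw("https://example.com/search/example"))
print(pretty_search_to_raw("https://example.com/search/hello%20world/page/2/?utm_source=x"))
```

With a mapping like this, each search ends up at a single URL instead of two, which is the point of the redirect.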

Advanced: URL cleanup

These are advanced settings that you should only use if you know what you are doing! To learn more, read the Advanced crawl settings article.

Will using the crawl settings affect my site’s rankings?

We understand that this all might sound a bit scary. But don’t worry, using the crawl settings in Yoast SEO will not harm your website’s crawlability or rankings. The crawl settings are there to help you clean up unnecessary URLs and can help search engines crawl your site more efficiently.

In addition to this, it’s important to note that the crawl settings in Yoast SEO do not have any effect on the crawl rate of a website. This means that the speed at which search engines, like Google, crawl and index a website is not impacted by the crawl settings in Yoast SEO.

