hreflang: The ultimate guide

hreflang tags are a technical solution for sites with similar content in multiple languages. The owner of a multilingual site wants search engines to send people to the content in their language. Say a user is Dutch, and the page that ranks is English, but there’s also a Dutch version. You would want Google to show the Dutch page in the search results for that Dutch user. This is the kind of problem hreflang is setting out to solve.

hreflang tags are among the most complex specs we’ve ever seen from a search engine. Doing it right is challenging and takes time. This guide aims to help you avoid the common traps, so be sure to read it thoroughly if you’re embarking on an hreflang project.

Need help implementing hreflang as part of your international SEO project? Our Multilingual SEO training (part of our Yoast SEO academy training subscription) is designed to help you understand the process and put it into practice. You’ll have a killer international SEO strategy in no time.

What are hreflang tags for?

hreflang tags are a method to mark up pages that have similar meanings but are aimed at different languages and/or regions. There are three common use cases:

  • Content with regional variations like en-us and en-gb.
  • Content in different languages like en, de and fr.
  • A combination of different languages and regional variations.

hreflang tags are commonly used to target different markets that use the same language – for example, to differentiate between the US and the UK or Germany and Austria.

What’s the SEO benefit of hreflang?

So why are we even talking about hreflang? What is the SEO benefit? From an SEO perspective, you should implement it for two main reasons.

First of all, if you have a version of a page that you have optimized for the users’ language and location, you want them to land on that page. Having the correct language and location-dependent information improves their user experience, so fewer people bounce back to the search results. And fewer people bouncing back to the search results leads to higher rankings.

The second reason is that hreflang prevents the problem of duplicate content. If you have the same content in English on different URLs aimed at the UK, the US, and Australia, the difference between these pages might be as small as a change in prices and currency. Without hreflang, Google might not understand what you’re trying to do and see it as duplicate content. With hreflang, you make it very clear to the search engine that it’s (almost) the same content, just optimized for different people.

What is hreflang?

hreflang is code, which you can show to search engines in three different ways – and there’s more on that below. Using this code, you specify all the different URLs on your site(s) with the same content. These URLs can have the same content in a foreign language or the same language but targeted at a different region.

Who supports hreflang?

hreflang is supported by Google and Yandex. Bing doesn’t have an equivalent but does support language meta tags. For Bing, the content-language tag is a far stronger signal than hreflang.

What does hreflang achieve?

In a complete hreflang implementation, every URL specifies which other variations are available. When a user searches, Google goes through the following process:

  1. it determines that it wants to rank a URL;
  2. it checks whether that URL has hreflang annotations;
  3. it presents the searcher with the most appropriate URL for that user.

The user’s current location and language settings determine the most appropriate URL. To complicate things, a user can have multiple languages in their browser’s settings, and the order in which these languages appear determines the most appropriate language.
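The three steps above can be modelled roughly in a few lines of Python. This is a simplified sketch, not Google’s actual algorithm: the function name pick_url, the shape of the annotations dictionary, and the rule of trying language plus region before the language alone are all assumptions made for illustration.

```python
def pick_url(annotations, languages, region):
    """Return the alternate URL that best fits a user, or None.

    annotations: dict mapping lowercase hreflang values to URLs.
    languages:   the user's languages in order of preference, e.g. ["nl", "en"].
    region:      the user's two-letter region code, e.g. "be".
    """
    for language in languages:
        # Try the most specific value first (language plus region),
        # then fall back to the language on its own.
        for candidate in (f"{language}-{region}".lower(), language.lower()):
            if candidate in annotations:
                return annotations[candidate]
    return None


# Hypothetical annotations for a page with an English and a Dutch version.
annotations = {
    "en": "https://www.example.com/",
    "nl": "https://www.example.com/nl/",
}

# A Dutch user whose browser lists Dutch first, then English.
print(pick_url(annotations, ["nl", "en"], "nl"))
```

Because the languages are tried in order, swapping the list to ["en", "nl"] would return the English URL instead, which mirrors the point about browser language order.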

Should you use hreflang?

Now that we’ve learned what hreflang is and how it works, we can decide whether you should use it. Use hreflang when:

  • you have the same content in multiple languages;
  • you have content aimed at different geographic regions but in the same language.

It doesn’t matter whether your content resides on one domain or multiple domains. You can link variations within the same domain and between domains.

Tip: homepage first!

If you’re not sure whether you want to implement hreflang on your entire site, start with your homepage! People searching for your brand will get the right page. This is a lot easier to implement and it will “catch” a large part of your traffic.

Architectural implementation choices

One thing is essential when implementing hreflang: don’t be too specific! Let’s say you have three types of pages:

  • Regular German language content
  • German language content, specifically aimed at Austria
  • German language content, specifically aimed at Switzerland

You could choose to implement them using three hreflang attributes like this:

  • de-de, which targets German speakers in Germany
  • de-at, targeting German speakers in Austria
  • de-ch, targeting German speakers in Switzerland

However, which of these three results should Google show to someone searching in German in Belgium, for example? In this case, the first page would probably be the best. To ensure that every user searching in German who does not match either de-at or de-ch gets the best default, we should change that hreflang attribute to just de. In many cases, specifying just the language is a smart thing to do.

It’s good to know that the most specific one wins when you create sets of links like this. The order in which the search engine sees the links doesn’t matter; it’ll always try to match from most specific to least specific.

Technical implementation – the basics

There are three basic rules regardless of which type of implementation you choose – and there’s more below.

1. Valid hreflang attributes

The hreflang attribute needs to contain a value that consists of the language, which can be combined with a region. The language code needs to be in ISO 639-1 format (a two-letter code).

Wrong region codes

Google can deal with some of the common mistakes with region codes, although you shouldn’t take any chances. For instance, it can deal with en-uk just as well as with the “correct” en-gb. However, en-eu does not work, as eu doesn’t define a country.

The region is optional and should be in ISO 3166-1 Alpha 2 format; more precisely, it should be an officially assigned element. Use this list from Wikipedia to verify you’re using the correct region and language codes, because this is where things often go wrong: using an incorrect region code is a widespread problem.
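To catch mistakes like these early, you could run every hreflang value through a small check before publishing. The Python sketch below works under a few assumptions: the function name hreflang_problems is made up for illustration, the two code sets are tiny samples rather than the full ISO lists, and it only covers the language and language-region forms described above, plus the special x-default value discussed later in this guide.

```python
import re

# Shape only: a two-letter language, an optional two-letter region, or x-default.
HREFLANG_SHAPE = re.compile(r"^(x-default|[a-z]{2}(-[a-z]{2})?)$", re.IGNORECASE)

# Illustrative samples only; a real check would load the complete
# ISO 639-1 and ISO 3166-1 alpha-2 lists.
SAMPLE_LANGUAGES = {"en", "de", "fr", "nl", "es"}
SAMPLE_REGIONS = {"us", "gb", "au", "de", "at", "ch", "nl", "be"}


def hreflang_problems(value):
    """Return a list of problems found in one hreflang value (empty if none)."""
    if not HREFLANG_SHAPE.match(value):
        return [f"{value!r} is not shaped like 'xx' or 'xx-yy'"]
    if value.lower() == "x-default":
        return []
    language, _, region = value.lower().partition("-")
    problems = []
    if language not in SAMPLE_LANGUAGES:
        problems.append(f"unknown language code {language!r}")
    if region and region not in SAMPLE_REGIONS:
        problems.append(f"unknown region code {region!r}")
    return problems


for value in ("en-gb", "en-uk", "en-eu"):
    print(value, hreflang_problems(value))
```

Note that this check is stricter than Google: it flags en-uk even though Google can cope with it, which fits the advice not to take any chances.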

2. Return links

The second basic rule is about return links. Regardless of your type of implementation, each URL needs return links to every other URL, and these links should point at the canonical versions; more on that below. The more languages you have, the more you might be tempted to limit those return links – but don’t. If you have 80 languages, you’ll have hreflang links for 80 URLs, and there’s no getting around it.

3. Self-links

The third and final basic rule is about self-links: each URL also needs an hreflang link that points to itself. It may feel weird to do this, just as those return links might feel weird, but they are essential, and your implementation will not work without them.
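Return links and self-links are easy to get wrong by hand, so it can help to check them automatically. Here is a minimal Python sketch; it assumes you have already collected the annotations of every page into a dictionary, and both that data structure and the function name find_link_problems are hypothetical.

```python
def find_link_problems(pages):
    """Report missing self-links and return links.

    pages maps each URL to its own hreflang annotations ({hreflang: url}).
    """
    problems = []
    for url, annotations in pages.items():
        targets = set(annotations.values())
        # Self-link rule: the page must list itself among its annotations.
        if url not in targets:
            problems.append(f"{url} has no self-link")
        # Return-link rule: every page we point at must point back at us.
        for target in sorted(targets - {url}):
            if url not in pages.get(target, {}).values():
                problems.append(f"{target} has no return link to {url}")
    return problems


# Hypothetical example: the en-gb page forgets to link back to the en page.
pages = {
    "https://www.example.com/": {
        "en": "https://www.example.com/",
        "en-gb": "https://www.example.com/en-gb/",
    },
    "https://www.example.com/en-gb/": {
        "en-gb": "https://www.example.com/en-gb/",
    },
}

for problem in find_link_problems(pages):
    print(problem)
```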

Technical implementation choices

There are three ways to implement hreflang:

  • using link elements in the <head>
  • using HTTP headers
  • or using an XML sitemap.

Each has its uses, so we’ll explain them and discuss which you should choose.

1. HTML hreflang link elements in your <head>

The first method to implement hreflang we’ll discuss is HTML hreflang link elements. You do this by adding code like this to the <head> section of every page:

<link rel="alternate" href="https://www.example.com/" hreflang="en" />
<link rel="alternate" href="https://www.example.com/en-gb/" hreflang="en-gb" />
<link rel="alternate" href="https://www.example.com/en-au/" hreflang="en-au" />

As every variation must link to every other variation, these implementations can become extensive and slow your site down. If you have twenty languages, choosing HTML link elements would mean adding twenty link elements, as shown above, to every page. That’s about 1.5KB on every page load that no user will ever use but every user still has to download, because this markup is purely meant for search engines. On top of that, your CMS will have to do several database calls to generate all these links. We would not recommend this for larger sites, as it adds too much unnecessary overhead.
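If you do go with link elements, you will want to generate them rather than type them. The following Python sketch shows one way to do that; the function name hreflang_link_elements and the alternates dictionary are our own illustration, not part of any CMS or plugin.

```python
from html import escape


def hreflang_link_elements(alternates):
    """Build one <link> element per variation from a {hreflang: url} dict."""
    lines = []
    for hreflang, url in alternates.items():
        # Escape both values so characters such as & or " cannot break the markup.
        lines.append(
            f'<link rel="alternate" href="{escape(url, quote=True)}" '
            f'hreflang="{escape(hreflang, quote=True)}" />'
        )
    return "\n".join(lines)


alternates = {
    "en": "https://www.example.com/",
    "en-gb": "https://www.example.com/en-gb/",
    "en-au": "https://www.example.com/en-au/",
}

markup = hreflang_link_elements(alternates)
print(markup)
# The byte count gives a feel for the overhead added to every page.
print(len(markup.encode("utf-8")), "bytes")
```

The same block of markup goes on every page in the set, which is what gives each page its self-link and its return links.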

2. hreflang HTTP headers

The second method of implementing hreflang is through HTTP headers. HTTP headers are for all your PDFs and other non-HTML content you might want to optimize. Link elements work nicely for HTML documents, but not for other types of content, as you can’t include link elements in them. That’s where HTTP headers come in, and they should look like this:

Link: <https://es.example.com/document.pdf>; rel="alternate"; hreflang="es",
      <https://en.example.com/document.pdf>; rel="alternate"; hreflang="en",
      <https://de.example.com/document.pdf>; rel="alternate"; hreflang="de"

The problem with having a lot of HTTP headers is similar to the problem with link elements in your <head>: it adds a lot of overhead to every request.
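The header value follows a simple pattern, so it can be assembled from the same kind of mapping you would use for link elements. This is a minimal Python sketch; the function name hreflang_link_header is hypothetical, and how you attach the header to a response depends on your server or framework, which we leave out here.

```python
def hreflang_link_header(alternates):
    """Build a single Link header line from a {hreflang: url} dict."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{hreflang}"'
        for hreflang, url in alternates.items()
    ]
    # All variations go into one comma-separated header value.
    return "Link: " + ", ".join(parts)


alternates = {
    "es": "https://es.example.com/document.pdf",
    "en": "https://en.example.com/document.pdf",
    "de": "https://de.example.com/document.pdf",
}

print(hreflang_link_header(alternates))
```

The output is one long line; the example above is only wrapped across lines for readability.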

3. An XML sitemap hreflang implementation

The third option for implementing hreflang is using XML sitemap markup. It uses the xhtml:link element in XML sitemaps to add the annotation to every URL. It works in much the same way as <link> elements in a page’s <head>. If you thought link elements were lengthy, the XML sitemap implementation is worse. To illustrate, this is the markup needed for just one URL with two other variations:

<url>
  <loc>https://www.example.com/uk/</loc> 
  <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/" /> 
  <xhtml:link rel="alternate" hreflang="en-au" href="https://www.example.com/au/" /> 
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://www.example.com/uk/" />
</url>

You can see it has a self-referencing URL as the third link, specifying that this URL is meant for en-gb, and it defines two other variations. Now, both of those other URLs need to be in the sitemap too, which looks like this:

<url>
  <loc>https://www.example.com/</loc> 
  <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/" /> 
  <xhtml:link rel="alternate" hreflang="en-au" href="https://www.example.com/au/" /> 
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://www.example.com/uk/" />
</url>
<url>
  <loc>https://www.example.com/au/</loc> 
  <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/" /> 
  <xhtml:link rel="alternate" hreflang="en-au" href="https://www.example.com/au/" /> 
  <xhtml:link rel="alternate" hreflang="en-gb" href="https://www.example.com/uk/" />
</url>

As you can see, we’re only changing the URLs within the <loc> element, as everything else should be the same. With this method, each URL has a self-referencing hreflang attribute and return links to the appropriate other URLs.

XML sitemap markup like this is very verbose: you need a lot of output to do this for many URLs. The benefit of an XML sitemap implementation is simple: your normal users won’t be bothered with this markup. You don’t end up adding extra page weight, and it doesn’t require a lot of database calls on page load to generate this markup.

Another benefit of adding hreflang through the XML sitemap is that it’s usually a lot easier to change an XML sitemap than to change all the pages on a site. This way, there’s no need to go through large approval processes, and maybe you can even get direct access to the XML sitemap file.
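Because sitemap markup is so repetitive, it is usually generated rather than written by hand. The sketch below uses Python’s standard xml.etree.ElementTree module. The function name build_hreflang_sitemap and the input dictionary are our own illustration, and the two namespace URIs are the ones a sitemap with xhtml:link entries normally declares on its <urlset> element.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(alternates):
    """Return sitemap XML for one set of variations ({hreflang: url})."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # One <url> entry per variation; only <loc> differs between entries.
    for page_url in alternates.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # Every entry carries the full set, including the link to itself.
        for hreflang, href in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": hreflang, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


alternates = {
    "en": "https://www.example.com/",
    "en-au": "https://www.example.com/au/",
    "en-gb": "https://www.example.com/uk/",
}

print(build_hreflang_sitemap(alternates))
```

Since every entry is built from the same dictionary, the self-referencing link and the return links come for free.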

Other technical aspects of an hreflang implementation

Once you’ve decided on your implementation method, there are a couple of other technical considerations you should know about before you start implementing hreflang.

hreflang x-default

x-default is a special hreflang attribute value that specifies where a user should be sent if none of the languages you’ve set in your other hreflang links match their browser settings. In a link element, it looks like this:

<link rel="alternate" href="https://www.example.com/" hreflang="x-default" />

When it was introduced, it was explained as being for “international landing pages”, i.e., pages where you redirect users based on their location. However, it can be described as the final “catch-all” of all the hreflang statements. It’s where users will be sent if their location and language don’t match anything else.

In the German example we mentioned above, a user searching in English still wouldn’t have a URL that fits them. That’s one of the cases where x-default comes into play. You’d add a fourth link to the markup and end up with these four:

  • de
  • de-at
  • de-ch
  • x-default

In this case, the x-default link would point to the same URL as the de one. We wouldn’t advise you to remove the de link, though, even though technically that would create precisely the same result. In the long run, it’s usually better to have both as it specifies the language of the de page – and it makes the code easier to read.
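To make the fallback order concrete, here is how the four German links could be resolved for different searchers. This is only an illustrative Python model of the behavior described above, not how any search engine is implemented; the resolve function and the example URLs are assumptions.

```python
def resolve(annotations, language, region):
    """Pick a URL, trying the most specific match first and x-default last."""
    for candidate in (f"{language}-{region}", language, "x-default"):
        if candidate in annotations:
            return annotations[candidate]
    return None


# Hypothetical URLs for the German example; x-default reuses the de URL.
german_set = {
    "de": "https://www.example.com/de/",
    "de-at": "https://www.example.com/at/",
    "de-ch": "https://www.example.com/ch/",
    "x-default": "https://www.example.com/de/",
}

print(resolve(german_set, "de", "at"))  # German speaker in Austria
print(resolve(german_set, "de", "be"))  # German speaker in Belgium
print(resolve(german_set, "en", "gb"))  # English speaker: falls to x-default
```

Without the x-default entry, the last call would return None, which is exactly the gap x-default is meant to close.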

hreflang and rel=canonical

rel="canonical"

If you don’t know what rel="canonical" is, read this article!

rel="alternate" hreflang="x" markup and rel="canonical" can and should be used together. Every language version should have a rel="canonical" link pointing to itself. For the first example, assuming that we’re on the example.com homepage, this would look like this:

<link rel="canonical" href="https://www.example.com/">
<link rel="alternate" href="https://www.example.com/" hreflang="en" />
<link rel="alternate" href="https://www.example.com/en-gb/" hreflang="en-gb" />
<link rel="alternate" href="https://www.example.com/en-au/" hreflang="en-au" />

If we were on the en-gb page, only the canonical would change:

<link rel="canonical" href="https://www.example.com/en-gb/">
<link rel="alternate" href="https://www.example.com/" hreflang="en" />
<link rel="alternate" href="https://www.example.com/en-gb/" hreflang="en-gb" />
<link rel="alternate" href="https://www.example.com/en-au/" hreflang="en-au" />

Don’t make the mistake of setting the canonical on the en-gb page to https://www.example.com/, as this breaks the implementation. The hreflang links must point to the canonical version of each URL because these systems should work hand in hand!
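A quick way to spot this mistake is to read the link elements out of a page and compare them with the URL the page lives on. The Python sketch below uses the standard html.parser module; the class name HeadLinkCollector and the function canonical_problems are hypothetical, and the check assumes you pass in the page’s own URL exactly as it should appear in the markup.

```python
from html.parser import HTMLParser


class HeadLinkCollector(HTMLParser):
    """Collect the canonical URL and the hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel == "canonical":
            self.canonical = attributes.get("href")
        elif rel == "alternate" and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"].lower()] = attributes.get("href")


def canonical_problems(page_url, html):
    """Return problems with the canonical and self-link of one page."""
    collector = HeadLinkCollector()
    collector.feed(html)
    problems = []
    if collector.canonical != page_url:
        problems.append(f"canonical is {collector.canonical!r}, expected {page_url!r}")
    if page_url not in collector.alternates.values():
        problems.append(f"no hreflang link points at {page_url!r}")
    return problems


# The broken variant described above: the en-gb page canonicalizes to the homepage.
broken_head = """
<link rel="canonical" href="https://www.example.com/">
<link rel="alternate" href="https://www.example.com/" hreflang="en" />
<link rel="alternate" href="https://www.example.com/en-gb/" hreflang="en-gb" />
"""

print(canonical_problems("https://www.example.com/en-gb/", broken_head))
```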

Useful tools when implementing hreflang

Now that you’ve come this far, you’ll probably be thinking, “Wow, this is hard!” We know – we thought that when we first started learning about it. But luckily, there are quite a few tools available if you dare to begin implementing hreflang.

hreflang tag generator

Aleyda Solis, who has also written a lot about this topic, has created a helpful hreflang tag generator that helps you generate link elements. Even when you’re not using a link element implementation, this can be useful to make some example code.

hreflang tag checker

To check whether the hreflang tags on your page or in your XML sitemaps are correct, we suggest two tools. There’s the hreflang Tags Testing Tool by Merkle, which allows you to insert a URL right away. In addition, there’s a Chrome extension that you can use while browsing a page. The tool takes a readout of a page’s hreflang tags and crawls them to assess if they back reference your current URL. The Hreflang Tag Checker can be found in the Chrome web store.

Making sure hreflang keeps working: process

Once you’ve created a working hreflang setup, you need to set up maintenance processes. It’s probably also a good idea to regularly audit your implementation to ensure it’s still set up correctly.

Make sure that people in your company who deal with content on your site know about hreflang so that they won’t do things that break your implementation. Two things are essential:

  1. When a page is deleted, check whether its counterparts are updated.
  2. When a page is redirected, change the hreflang URLs on its counterparts.

If you do that and audit regularly, you shouldn’t run into any issues.
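Part of such an audit can be scripted. As a rough sketch, suppose a crawl has already given you each page’s hreflang annotations and the HTTP status code of every URL; both inputs, and the function name stale_hreflang_targets, are assumptions for this example. The function then lists every hreflang link that points at a deleted or redirected page.

```python
def stale_hreflang_targets(pages, statuses):
    """Find hreflang links whose target was deleted or redirected.

    pages:    {url: {hreflang: target_url}} as collected by a crawl.
    statuses: {url: HTTP status code} for every URL the crawl requested.
    """
    stale = []
    for url, annotations in pages.items():
        for hreflang, target in annotations.items():
            status = statuses.get(target)
            if status in (404, 410):
                stale.append((url, hreflang, target, "deleted"))
            elif status in (301, 302, 307, 308):
                stale.append((url, hreflang, target, "redirected"))
    return stale


# Hypothetical crawl data: the en-au page has been redirected.
pages = {
    "https://www.example.com/": {
        "en": "https://www.example.com/",
        "en-au": "https://www.example.com/en-au/",
    },
}
statuses = {
    "https://www.example.com/": 200,
    "https://www.example.com/en-au/": 301,
}

for finding in stale_hreflang_targets(pages, statuses):
    print(finding)
```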

Conclusion

Setting up hreflang is a cumbersome process. It’s a demanding standard with many specific things you should know and deal with. This guide will be updated as new things are introduced around this specification, and best practices evolve, so check back when you’re working on your implementation again!

Implementing hreflang is an essential part of technical SEO. But there’s more that you can work on to get your technical SEO shipshape. Find out how technically fit your website is with our technical SEO fitness quiz and see what you can still work on.
