rel=canonical • What it is and how (not) to use it

In February 2009, six years to the day before this article was published, Google, Bing and Yahoo! introduced the rel=canonical link element (Matt’s post is probably the easiest reading). While the idea is simple, the specifics of how to use it turn out to be complex. The basic premise is: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at that. This solves a duplicate content problem where search engines don’t know which version of the content to show. This article takes you through the use cases and the anti use cases.


Easiest correct example of using rel=canonical

Let’s assume you have two versions of the same page. Exactly, 100% the same content. They differ in that they’re in separate sections of your site and because of that the background color and the active menu item differ. That’s it. Both versions have been linked from other sites, the content itself is clearly valuable. Which version should a search engine show? Nobody knows.

For example’s sake, these are their URLs:

  • http://example.com/wordpress/seo-plugin/
  • http://example.com/wordpress/seo/seo-plugin/

This is what rel=canonical was invented for. Especially in a lot of e-commerce systems, this (unfortunately) happens fairly often, where a product has several different URLs depending on how you got there. You would apply rel=canonical as follows:

  1. You pick one of your two pages as the canonical version. It should be the version you think is the most important one. If you don’t care, pick the one with the most links or visitors. If all of that’s equal: flip a coin. You need to choose.
  2. Add a rel=canonical link from the non-canonical page to the canonical one. So if we picked the shorter URL as our canonical URL, the other URL would link to the shorter URL like so in the <head> section of the page:
    <link rel="canonical" href="http://example.com/wordpress/seo-plugin/">

    That’s it. Nothing more, nothing less.

What this does technically is “merge” the two pages into one from a search engine’s perspective. It’s basically a sort of “soft redirect”, without redirecting the user. Links to both URLs now count for the single canonical version of the URL.
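
To make the two steps concrete, here is a minimal Python sketch. It assumes, as in the example above, that the shorter URL is the one we picked as canonical; the helper name `canonical_link` is made up for this illustration and is not part of any plugin or library.

```python
from html import escape


def canonical_link(canonical_url: str) -> str:
    """Return the <link> element to place in the <head> of a duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# The two duplicate URLs from the example above.
duplicates = [
    "http://example.com/wordpress/seo-plugin/",
    "http://example.com/wordpress/seo/seo-plugin/",
]

# Assumption for this sketch: the shorter URL is our canonical choice.
canonical = min(duplicates, key=len)

# Every non-canonical page gets a link element pointing at the canonical one.
tags = {url: canonical_link(canonical) for url in duplicates if url != canonical}

for url, tag in tags.items():
    print(url, "->", tag)
```

How you pick the canonical is up to you; the only thing that matters is that every duplicate ends up pointing at the same single URL.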

Should a page have a self referencing canonical?

In the example above, we make the non-canonical page link to the canonical version. But should a page set a rel=canonical for itself? This is actually discussed every once in a while. I have a strong preference for having a canonical link element on every page. The reason is that most CMSes will allow URL parameters without changing the content. So these would show the same content:

  • http://example.com/wordpress/seo-plugin/
  • http://example.com/wordpress/seo-plugin/?isnt=it-awesome
  • http://example.com/wordpress/seo-plugin/?cmpgn=twitter
  • http://example.com/wordpress/seo-plugin/?cmpgn=facebook

etc. You see my point. If you don’t have a self referencing canonical on the page that points to the cleanest version of the URL, you risk search engines indexing each of these variations as a separate page. Even if you don’t create such URLs yourself, someone else could link to them and cause a duplicate content issue for you. So adding a self referencing canonical to URLs across your site is a good “defensive” SEO move. Luckily for you, our WordPress SEO plugin does this for you.
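
As a rough illustration of what “the cleanest version of the URL” means for the list above, the Python sketch below drops the query string and fragment. It assumes that no query parameter on your site changes the content; if some do (a page number or a product ID, for instance), this is too blunt. The function name `clean_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The URL variations listed above.
variants = [
    "http://example.com/wordpress/seo-plugin/",
    "http://example.com/wordpress/seo-plugin/?isnt=it-awesome",
    "http://example.com/wordpress/seo-plugin/?cmpgn=twitter",
    "http://example.com/wordpress/seo-plugin/?cmpgn=facebook",
]

# All variations collapse to one self referencing canonical.
canonicals = {clean_url(url) for url in variants}
print(canonicals)
```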

Cross domain canonical

Now, you might have the same piece of content on several domains. For instance, SearchEngineJournal regularly republishes articles from Yoast.com (with explicit permission). Look at every one of those articles and you’ll see a rel=canonical link pointing right back at our original article. This means all the links pointing at their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, we get a clear benefit from it too. Everybody wins.

The risk of faulty canonicals: common errors

There are a multitude of cases out there showing that a wrong rel=canonical implementation can lead to huge issues. I know of several sites that had the canonical on their homepage point to an article, and completely lost their homepage from the search results. There are more things you shouldn’t do with rel=canonical. Let me list the most important ones:

  • Don’t canonicalize a paginated archive to page 1. Don’t add a rel=canonical on page 2 and further that points back to page 1: if you do, search engines will actually not index the links on those deeper archive pages anymore…
  • Make them 100% specific. For various reasons, a ton of sites use protocol relative links, meaning they leave the http / https bit off their URLs. Don’t do this for your canonicals. You have a preference. Show it.
  • Don’t base your canonical on the request URL. If you use variables like the domain or request URI used to access the current page while generating your canonical, you’re doing it wrong. Your content should be aware of its own URLs. Otherwise, you could still have the same piece of content on, for instance, example.com and www.example.com and have them both canonicalize to themselves.
  • Multiple rel=canonical links on a page causing havoc. Sometimes a developer of a plugin or extension thinks that he’s God’s greatest gift to mankind and he knows best how to add a canonical to the page. Sometimes, that developer is right. But since you can’t all be me, they’re inevitably wrong too sometimes. When we encounter this in WordPress plugins we try to reach out to the developer doing it and teach them not to, but it happens. And when it does, the results are wholly unpredictable.
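
Two of the errors above, protocol relative canonicals and multiple canonicals on one page, are easy to check for automatically. The following is a minimal sketch using only Python’s standard library; the class and function names are made up for this example, and it only looks at the HTML you feed it, not at HTTP headers.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels:
            self.hrefs.append(attr.get("href") or "")


def audit_canonicals(html: str) -> list[str]:
    """Return a list of human-readable problems found in the page's canonicals."""
    collector = CanonicalCollector()
    collector.feed(html)
    problems = []
    if len(collector.hrefs) > 1:
        problems.append("multiple rel=canonical links")
    for href in collector.hrefs:
        # Protocol relative ("//example.com/...") and relative URLs both fail here.
        if not href.startswith(("http://", "https://")):
            problems.append(f"canonical is not an absolute URL with protocol: {href}")
    return problems


page = (
    '<head><link rel="canonical" href="//example.com/a/">'
    '<link rel="canonical" href="http://example.com/b/"></head>'
)
print(audit_canonicals(page))
```

A page with exactly one canonical that includes the protocol produces an empty list.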

Good to know: rel=canonical and social networks

Facebook and Twitter honor rel=canonical too. This might lead to weird situations. If you share a URL on Facebook that has a canonical pointing elsewhere, Facebook will share the details from the canonical URL. In fact, if you add a like button on a page that has a canonical pointing elsewhere, it will show the like count for the canonical URL, not for the current URL. Twitter works in the same way.

Setting the canonical in WordPress SEO

If you use WordPress SEO, you can change the canonical of several page types using the plugin. You only need to do this if you want to change the canonical to something different than the current page’s URL. WordPress SEO already renders the correct canonical URL for almost any page type in a WordPress install.

For posts, pages and custom post types, you can edit the canonical in the advanced tab of the WordPress SEO metabox:

Change the rel=canonical in WordPress SEO

For categories, tags and other taxonomy terms, you can change them here:

Change the canonical of a category, tag or custom taxonomy term on the edit term page

If you have other advanced use cases, you can always use the wpseo_canonical filter to change the WordPress SEO output.

Advanced uses of rel=canonical

Site migrations (use with care)

Sometimes, when you’re moving a site from one domain to another, you might want to “soft-launch” the new site. This could for instance be the case when you’re combining the migration with a rebrand and redesign and you want to let people get used to the new brand for a while first before you finally flip the switch.

When you do something like this, you could have a complex rel=canonical scheme where at first, you canonicalize the new site to the old one and then after a month or so flip the direction of the canonicals to the new site. This would prevent the new site from showing up in the search results during the first month and would then slowly start the migration process in the second month. Do not leave this online forever though: 301 redirect to the new domain at some point. A 301 redirect is still a far more reliable and more widely trusted method of moving content.

Canonical link HTTP header

Google also supports a canonical link HTTP header. While these can be very useful if you’re a savvy server admin, they also tend to get abused by hackers a lot. It’s hard to spot these if you’re not a pro and all the link juice for your page might be pointing at someone else without you ever noticing until the page drops out of the search results…

They can be very useful though, for instance for canonicalizing PDFs, so it’s good to know that the option exists.
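
For reference, the header takes the form `Link: <URL>; rel="canonical"`. The Python sketch below builds that header for a PDF and pulls canonical URLs back out of a Link header value, which is roughly what you would do to check what your own server is sending. The PDF URL and both function names are made up for this example, and the parsing is deliberately simplified: it assumes no commas inside URLs or parameter values.

```python
import re

# One "<url>; params" entry per match; entries are separated by commas.
_LINK_ENTRY = re.compile(r"<([^>]*)>([^,]*)")
_REL_CANONICAL = re.compile(r'\brel\s*=\s*"?canonical"?\s*(;|$)', re.IGNORECASE)


def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Return the (name, value) pair for a canonical Link HTTP header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def canonicals_in_link_header(value: str) -> list[str]:
    """Extract every URL marked rel=canonical from a Link header value."""
    return [
        url
        for url, params in _LINK_ENTRY.findall(value)
        if _REL_CANONICAL.search(params.strip())
    ]


# Hypothetical PDF that should be canonicalized to itself.
name, value = canonical_link_header("http://example.com/downloads/white-paper.pdf")
print(f"{name}: {value}")
print(canonicals_in_link_header(value))
```

If the extracted URL is on a domain you don’t recognize, that is the kind of abuse described above.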

Using rel=canonical on not so similar pages

While I won’t recommend this, you can definitely use rel=canonical very aggressively. Google honors it to an almost ridiculous extent, where you can canonicalize a very different piece of content to another piece of content. If Google catches you doing this though, it might stop trusting your site’s canonicals and thus cause you more harm…

Conclusion: rel=canonical is a power tool

Rel=canonical has, in the 6 years of its existence, turned into a powerful tool in an SEO’s toolbox, but like any power tool, you should use it wisely as it’s easy to cut yourself. We’re curious as to what the next 6 years of canonical will bring.




28 Responses

  1. Maciej
    By Maciej on 12 February, 2015

    Hi!

    Thank you for the great article.
    On our website we publish monthly reports and rankings. The content isn’t always the same (it is not copy/paste) but generally has the same data in a different order. There is also some comment attached to this (always different). We started to use rel=canonical and we can see great results from that: for the important phrases we always have the newest content in Google. Isn’t it risky to do such things?

    Cheers!

    • Michiel Heijmans
      By Michiel Heijmans on 13 February, 2015

      Why would you think that, Maciej?

      • Te Zet
        By Te Zet on 13 February, 2015

        Because the content itself is different.

  2. Ruth
    By Ruth on 12 February, 2015

    Is there any plan to add rel=canonical to RSS/XML feeds so folks using your content will have that link back automatically? I suppose they’d still have to grab/use that field so it wouldn’t be foolproof. Can you speak to this?

    • Michiel Heijmans
      By Michiel Heijmans on 13 February, 2015

      It’s indeed an easy way to scrape content as well. Most of the time, this is done automatically – that is why we added the option to add a backlink to your content in RSS in our WordPress SEO plugin. That will make sure there is a link back to your site. Google will understand from the link that your website was the first to publish the content and will value your page much higher. It’s not the same, but basically the same indication to Google. Hope that helps!

  3. Anders
    By Anders on 12 February, 2015

    Hi,

    Ok I read this article but still not sure if to use rel=canonical for this scenario for wordpress:

    Site has a page for Areas Covered like this:
    http://sitename.com/areas-covered
    On that page there is a list of different areas covered, where each area has a link to a page that is almost identical to the areas covered page above, except that the link is unique for each area and the page’s title uses the unique area name.
    Example of the other pages;
    http://sitename.com/areas-covered-locationA
    http://sitename.com/areas-covered-locationB
    http://sitename.com/areas-covered-locationC
    http://sitename.com/areas-covered-locationD
    http://sitename.com/areas-covered-locationE
    http://sitename.com/areas-covered-locationF
    etc….etc….

    These other pages all have the list of areas on them and all link to each other area page.

    What is the correct procedure here?

    As mentioned, the pages are not identical because the title is unique for each page, i.e. LocationA, and some words have been changed (but perhaps 99% is the same initially).

    The plan would be to later add more unique location images for each of all the location pages, perhaps write some more unique things about the area for each area page, but generally they are almost identical.

    Finally, if it’s a bad idea in the first place to have one similar page with a different URL for each area, then if rel=canonical is not the solution, would it be better to leave as-is or to hide all the separate area pages with a hide page plugin and stop robots from listing/indexing or caching these pages, only keeping the main page “Areas Covered”?

    I would appreciate advice that is simple to understand for a non-expert.

    Thanks
    Anders

    • Anders Sundstedt
      By Anders Sundstedt on 13 February, 2015

      Anyone that can reply to this one please?

    • Michiel Heijmans
      By Michiel Heijmans on 13 February, 2015

      That’s plain duplicate content. If the majority of the actual content on a page is the same, it’s duplicate.

      We recently published a very nice post on local SEO by Kris Jones that you might want to check. Please understand that the approach you mention is past its expiration date. It might work for now, but usually only in the short term. In the long run, local rankings are for local companies.

  4. Christian
    By Christian on 12 February, 2015

    Hi,
    Great post, very clear explanation.
    I had read though that rel=canonical should not be used with hreflang tags (which I find is not so well taken into account by Google).
    https://sites.google.com/site/webmasterhelpforum/en/faq-internationalisation (last QA in Rel Alternate Hreflang section)
    Thanks!
    Christian

    • Michiel Heijmans
      By Michiel Heijmans on 13 February, 2015

      That’s not what it says, Christian :) You shouldn’t use canonical LIKE hreflang. In case of a multilingual website, the canonical goes to the page at hand (if appropriate) and the hreflang / alternates go to the other languages. So you can use these alongside each other, but don’t use canonical links for pointing to other languages to rank.

      The exact phrase on the linked page is: “We recommend not using rel=canonical across different language or country versions.”

      • Christian
        By Christian on 13 February, 2015

        You are right, Michiel. Thanks!

  5. Nigam
    By Nigam on 13 February, 2015

    @Joost:
    Thanks for continuing to share such useful and informative articles.
    Indeed, rel=canonical will prove to be a powerful tool if used wisely.
    Many SEO experts use this not only to avoid DUPLICATE content issues but also COPYRIGHT issues. What’s your opinion on this?
    Thanks,

    Nigam Parikh
    Mumbai,India.

    • Michiel Heijmans
      By Michiel Heijmans on 13 February, 2015

      Who says that accounts for copyright? I’d add a copyright statement in your footer / terms instead. Canonical isn’t anything that gives you rights..?

  6. kalyan
    By kalyan on 13 February, 2015

    Hi, I recently changed the category names on the site and removed them from the parent category. This created duplicate meta tags and title tags, which I noticed in Webmaster Tools. Can I use a redirection plugin? Or, as said in the above post, can I use settings in your plugin? If yes, please specify where, as I have no technical experience.

    The first is the new url and cat ; the second is the old one.
    /gs1/epw-disabled-citizens-disability-certificate/932/
    /p2/society/epw-disabled-citizens-disability-certificate/932/

  7. Patrice
    By Patrice on 13 February, 2015

    Until I read your article, I didn’t even know that they existed. It would be awesome to find a tool that checks for “compromised headers”. Do you know if such a tool exists?

    • Raoul de Boer
      By Raoul de Boer on 13 February, 2015

      I agree! Let me know when you find such a tool :-)

  8. David Sottimano
    By David Sottimano on 13 February, 2015

    Good article, solid advice, except one thing. I respectfully disagree with usage of cross domain rel canonicals for site migrations / rebrands. I know you added a line saying 301s are always the best, but that section of the article might confuse a lot of non SEOs as to what to do during migrations. Site migrations are stressful enough with 301s, so using a combination of rel canonicals, then 301s is sending 2 signals rather than 1, which significantly increases the chances of error (both on Google’s part and your own). Maybe you know something I don’t? ;)

  9. James
    By James on 13 February, 2015

    Awesome post Joost! Could have done with it last week whilst trying to justify a site restructure to a client.
    Bookmarked

  10. Siddharth T. Patel
    By Siddharth T. Patel on 16 February, 2015

    Very informative blog post. Everything makes sense though. Keep up the good work ahead too. Really appreciate the effort.

  11. Dennis
    By Dennis on 16 February, 2015

    Great stuff reading Yoast. I really think you came into the depth with canonical. I did not know, that Facebook is giving credit to the original source though, so thank you for new information!

  12. Paul Altieri
    By Paul Altieri on 17 February, 2015

    Hi Joost,

    Nice read…. site looks great. thanks for the tips.

  13. Arne van Elk
    By Arne van Elk on 17 February, 2015

    Hi Joost, have a question about canonicals and pagination. You say: “Don’t canonicalize a paginated archive to page 1. Don’t add a rel=canonical on page 2 and further, search engines will actually not index the links on those deeper archive pages anymore”.

    I wonder if this is true. Apart from indexation, does a canonical tell the search engine a page should not be followed as well? It seems to me these are two different things.

  14. viki sangre
    By viki sangre on 19 February, 2015

    I think if we can’t use the canonical tag for either URL then it leads to cloaking. Many of my website’s pages were detected as cloaked by several tools. After using the canonical tag on the archive, everything went right.
    Thanks for sharing

  15. James Kockelbergh
    By James Kockelbergh on 20 February, 2015

    Hi Joost,

    As always great article.

    I am currently cleaning up a client’s website where previously somebody did some real harm by “Using rel=canonical on not so similar pages” to boost link juice to a few pages on the website. Google penalized this website badly, as it lost most of its traffic from search.

    Would you recommend a manual review by Google after cleaning up the website?

    If you wish to add any further recommendations they will be appreciated.

    Best regards,

    James.

  16. jashon
    By jashon on 21 February, 2015

    So if we put a self-pointing canonical on every page, we are safe from duplicate content issues from those rogue requests? That’s nice, I was always having problems with that.

  17. Shawn
    By Shawn on 24 February, 2015

    Good article. Curious about URL cloaking for FB and Twitter. We carefully canonicalize everything back to a single canonical URL across multiple sites where content might be viewed, but this creates problems with FB and Twitter sharing, especially when sites are branded or private labeled with their own FB campaigns. I saw on Stack that FB is okay with cloaking the canonical URL so it’s different (local) for FB and Twitter but constant for Google and other bots. Is this valid / white hat / the right way to handle this?