rel=canonical • What it is and how (not) to use it
In February 2009, six years to the day from when this is published, Google, Bing and Yahoo! introduced the rel=canonical link element (Matt’s post is probably the easiest reading). While the idea is simple, the specifics of how to use it turn out to be complex. The basic premise is: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at that. This solves a duplicate content problem where search engines don’t know which version of the content to show. This article takes you through the use cases and the anti use cases.

Easiest correct example of using rel=canonical
Let’s assume you have two versions of the same page. Exactly, 100% the same content. They differ in that they’re in separate sections of your site and because of that the background color and the active menu item differ. That’s it. Both versions have been linked from other sites, the content itself is clearly valuable. Which version should a search engine show? Nobody knows.
For example’s sake, these are their URLs:
- http://example.com/wordpress/seo-plugin/
- http://example.com/wordpress/seo/seo-plugin/
This is what rel=canonical was invented for. Especially in a lot of e-commerce systems, this (unfortunately) happens fairly often, where a product has several different URLs depending on how you got there. You would apply rel=canonical as follows:
- You pick one of your two pages as the canonical version. It should be the version you think is the most important one. If you don’t care, pick the one with the most links or visitors. If all of that’s equal: flip a coin. You need to choose.
- Add a rel=canonical link from the non-canonical page to the canonical one. So if we picked the shortest URL as our canonical URL, the other URL would link to the shortest URL like so in the <head> section of the page: <link rel="canonical" href="http://example.com/wordpress/seo-plugin/">
That’s it. Nothing more, nothing less.
What this does technically is “merge” the two pages into one from a search engine’s perspective. It’s basically a sort of “soft redirect”, without redirecting the user. Links to both URLs now count for the single canonical version of the URL.
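To make the “merge” idea concrete, here is a minimal sketch in Python. It is not how any search engine actually works (they treat the canonical as a hint and weigh many other signals); it only illustrates links to both URLs being credited to the canonical one. The link counts and the names CANONICAL_OF, INBOUND_LINKS and merge_link_counts are made up for this example.

```python
from collections import Counter

# Hypothetical data: the non-canonical page declares its canonical URL.
CANONICAL_OF = {
    "http://example.com/wordpress/seo/seo-plugin/": "http://example.com/wordpress/seo-plugin/",
}

# Hypothetical inbound link counts per URL (numbers invented for illustration).
INBOUND_LINKS = {
    "http://example.com/wordpress/seo-plugin/": 12,
    "http://example.com/wordpress/seo/seo-plugin/": 5,
}


def merge_link_counts(canonical_of, inbound_links):
    """Credit every URL's inbound links to the canonical URL it declares."""
    merged = Counter()
    for url, count in inbound_links.items():
        # A URL that declares no canonical is treated as its own canonical.
        merged[canonical_of.get(url, url)] += count
    return dict(merged)


merged = merge_link_counts(CANONICAL_OF, INBOUND_LINKS)
print(merged)
```

In this toy model the two URLs end up as a single entry under the canonical URL, which is the effect described above.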
Should a page have a self referencing canonical?
In the example above, we make the non-canonical page link to the canonical version. But should a page set a rel=canonical for itself? This is actually discussed every once in a while. I have a strong preference for having a canonical link element on every page. The reason is that most CMSes will allow URL parameters without changing the content. So these would show the same content:
- http://example.com/wordpress/seo-plugin/
- http://example.com/wordpress/seo-plugin/?isnt=it-awesome
- http://example.com/wordpress/seo-plugin/?cmpgn=twitter
- http://example.com/wordpress/seo-plugin/?cmpgn=facebook
etc. You see my point. If you don’t have a self referencing canonical on the page that points to the cleanest version of the URL, you risk being hit by this stuff. Even if you don’t do it yourself, someone else could do this to you and cause a duplicate content issue. So adding a self referencing canonical to URLs across your site is a good “defensive” SEO move. Luckily for you, our WordPress SEO plugin does this for you.
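As a rough sketch of what “self referencing” means in practice, the Python below shows every parameterized variant of a page emitting the same canonical link element. This is an illustration under assumptions, not how our plugin is implemented: the PERMALINKS store and the canonical_tag_for function are hypothetical, and the point is only that the href comes from the page's own stored URL.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical content store: each page knows its own clean URL.
PERMALINKS = {
    "/wordpress/seo-plugin/": "http://example.com/wordpress/seo-plugin/",
}


def canonical_tag_for(request_url):
    """Return the canonical link element for whatever URL was requested.

    The query string is ignored when looking up the page, and the href
    comes from the stored permalink, never from the request itself.
    """
    path = urlsplit(request_url).path
    permalink = PERMALINKS[path]
    return '<link rel="canonical" href="{}">'.format(escape(permalink, quote=True))


variants = [
    "http://example.com/wordpress/seo-plugin/",
    "http://example.com/wordpress/seo-plugin/?isnt=it-awesome",
    "http://example.com/wordpress/seo-plugin/?cmpgn=twitter",
    "http://example.com/wordpress/seo-plugin/?cmpgn=facebook",
]

# A set collapses identical tags, so one element means all variants agree.
tags = {canonical_tag_for(url) for url in variants}
print(tags)
```

All four requests produce one and the same tag, so whatever parameters someone appends, the page keeps pointing at its clean URL.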
Cross domain canonical
Now, you might have the same piece of content on several domains. For instance, SearchEngineJournal regularly republishes articles from Yoast.com (with explicit permission). Look at every one of those articles and you’ll see a rel=canonical link pointing right back at our original article. This means all the links pointing at their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, we get a clear benefit from it too. Everybody wins.
The risk of faulty canonicals: common errors
There are a multitude of cases out there showing that a wrong rel=canonical implementation can lead to huge issues. I know of several sites that had the canonical on their homepage point to an article, and completely lost their homepage from the search results. There are more things you shouldn’t do with rel=canonical. Let me list the most important ones:
- Don’t canonicalize a paginated archive to page 1. Don’t add a rel=canonical that points to page 1 on page 2 and further; if you do, search engines will actually not index the links on those deeper archive pages anymore…
- Make them 100% specific. For various reasons, a ton of sites use protocol relative links, meaning they leave the http / https bit off their URLs. Don’t do this for your canonicals. You have a preference. Show it.
- Don’t base your canonical on the request URL. If you use variables like the domain or request URI used to access the current page while generating your canonical, you’re doing it wrong. Your content should be aware of its own URLs. Otherwise, you could still have the same piece of content on, for instance, example.com and www.example.com and have them both canonicalize to themselves.
- Multiple rel=canonical links on a page causing havoc. Sometimes a developer of a plugin or extension thinks that he’s God’s greatest gift to mankind and he knows best how to add a canonical to the page. Sometimes, that developer is right. But since you can’t all be me, they’re inevitably wrong too sometimes. When we encounter this in WordPress plugins we try to reach out to the developer doing it and teach them not to, but it happens. And when it does, the results are wholly unpredictable.
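Two of the errors in this list are easy to check for mechanically. The sketch below uses Python's standard html.parser to flag pages with more than one rel=canonical link and canonicals that are protocol relative or otherwise not absolute. It is a small illustration, not a full audit tool: the names CanonicalCollector and audit_canonicals are mine, and it does not detect the pagination or request-URL mistakes.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attr.get("href") or "")


def audit_canonicals(html_text):
    """Return a list of human-readable problems found in the canonicals."""
    collector = CanonicalCollector()
    collector.feed(html_text)
    problems = []
    if len(collector.hrefs) > 1:
        problems.append("multiple rel=canonical links")
    for href in collector.hrefs:
        if href.startswith("//"):
            problems.append("protocol-relative canonical: " + href)
        elif not href.startswith(("http://", "https://")):
            problems.append("canonical is not an absolute URL: " + href)
    return problems


sample = """<head>
<link rel="canonical" href="//example.com/wordpress/seo-plugin/">
<link rel="canonical" href="http://example.com/wordpress/seo-plugin/">
</head>"""

problems = audit_canonicals(sample)
for problem in problems:
    print(problem)
```

A page with a single, fully specified canonical yields an empty list; the sample above is flagged both for the duplicate and for the protocol-relative href.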


Hi!
Thank you for the great article.
On our website we publish monthly reports and rankings. The content isn’t always the same (it is not copy/paste) but generally has the same data in a different order. There is also some commentary attached to this (always different). We started to use rel=canonical and we can see great results from that, plus under the important phrases we always have the newest content in Google. Isn’t it risky to do such things?
Cheers!
Why would you think that, Maciej?
Because the content itself is different.
Is there any plan to add rel=canonical to RSS/XML feeds so folks using your content will have that link back automatically? I suppose they’d still have to grab/use that field so it wouldn’t be foolproof. Can you speak to this?
It’s indeed an easy way to scrape content as well. Most of the times, this is done automatically – that is why we added the option to add a backlink to your content in RSS in our WordPress SEO plugin. That will make sure there is a link back to your site. Google will understand from the link that your website was the first to publish the content and will value your page much higher. It’s not the same, but basically the same indication to Google. Hope that helps!
Hi,
Ok I read this article but still not sure if to use rel=canonical for this scenario for wordpress:
Site has a page for Areas Covered like this:
http://sitename.com/areas-covered
On that page there is a list of different areas covered, where each area has a link to a page that is almost identical to the areas covered page above, except that the link is unique for each area and the page title uses the unique area name.
Example of the other pages;
http://sitename.com/areas-covered-locationA
http://sitename.com/areas-covered-locationB
http://sitename.com/areas-covered-locationC
http://sitename.com/areas-covered-locationD
http://sitename.com/areas-covered-locationE
http://sitename.com/areas-covered-locationF
etc….etc….
These other pages all have the list of areas on them and all link to each other area page.
What is the correct procedure here?
As mentioned, the pages are not identical because the title is unique for each page, i.e. LocationA, and some words have been changed (but 99% is the same, perhaps initially).
The plan would be to later add more unique location images for each of all the location pages, perhaps write some more unique things about the area for each area page, but generally they are almost identical.
Finally, if it’s a bad idea in the first place to have one similar page with a different URL for each area, then if rel=canonical is not the solution, would it be better to leave as-is or to hide all the separate area pages with a hide page plugin and stop robots from listing/indexing or caching these pages, only keeping the main page “Areas Covered”?
I would appreciate advice that is simple to understand for a non-expert.
Thanks
Anders
Anyone that can reply to this one please?
That’s plain duplicate content. If the majority of the actual content on a page is the same, it’s duplicate.
We recently published a very nice post on local SEO by Kris Jones you might want to check. Please understand that the approach you mention has passed its expiration date. It might work for now, but usually only in the short term. In the long run, local rankings are for local companies.
Hi,
Great post, very clear explanation.
I had read though that rel=canonical should not be used with hreflang tags (which I find is not so well taken into account by Google).
https://sites.google.com/site/webmasterhelpforum/en/faq-internationalisation (last QA in Rel Alternate Hreflang section)
Thanks!
Christian
That’s not what it says, Christian :) You shouldn’t use canonical LIKE hreflang. In case of a multilingual website, the canonical goes to the page at hand (if appropriate) and the hreflang / alternates go to the other languages. So you can use these alongside each other, but don’t use canonical links for pointing to other languages to rank.
The exact phrase on the linked page is: “We recommend not using rel=canonical across different language or country versions.”
You are right, Michiel. Thanks!
@Joost:
Thanks for continuing to share such useful and informative articles.
Indeed, rel=canonical will prove to be a powerful tool if used wisely.
Many SEO experts use this not only to avoid duplicate content issues but also copyright issues. What’s your opinion on this?
Thanks,
–
Nigam Parikh
Mumbai,India.
Who says that accounts for copyright? I’d add a copyright statement in your footer / terms instead. Canonical isn’t anything that gives you rights..?
Hi, I recently changed the category names on the site and removed them from the parent category. This created duplicate meta tags and title tags, which I noticed in Webmaster Tools. Can I use a redirection plugin? Or, as said in the above post, can I use settings in your plugin? If yes, please specify where, as I have no technical experience.
The first is the new url and cat ; the second is the old one.
/gs1/epw-disabled-citizens-disability-certificate/932/
/p2/society/epw-disabled-citizens-disability-certificate/932/
I’m sorry, but we don’t do support via our comments. If you are a paying customer, please email plugin support. For free support, visit the WordPress.org forums.
It’s OK, thanks.
Until I read your article, I didn’t even know that they existed. It would be awesome to find a tool that checks for “compromised headers”. Do you know if such a tool exists?
I agree! Let me know when you find such a tool :-)
Good article, solid advice, except one thing. I respectfully disagree with usage of cross domain rel canonicals for site migrations / rebrands. I know you added a line saying 301s are always the best, but that section of the article might confuse a lot of non SEOs as to what to do during migrations. Site migrations are stressful enough with 301s, so using a combination of rel canonicals, then 301s is sending 2 signals rather than 1, which significantly increases the chances of error (both on Google’s part and your own). Maybe you know something I don’t? ;)
Awesome post Joost! Could have done with it last week whilst trying to justify a site restructure to a client.
Bookmarked
Very informative blog post. Everything makes sense though. Keep up the good work ahead too. Really appreciate the effort.
Great stuff reading Yoast. I really think you came into the depth with canonical. I did not know, that Facebook is giving credit to the original source though, so thank you for new information!
Hi Joost,
Nice read…. site looks great. thanks for the tips.
Hi Joost, have a question about canonicals and pagination. You say: “Don’t canonicalize a paginated archive to page 1. Don’t add a rel=canonical on page 2 and further, search engines will actually not index the links on those deeper archive pages anymore”.
I wonder if this is true. Apart from indexation, does a canonical tell the search engine a page should not be followed as well? It seems to me these are two different things.
I think if we can’t use the canonical tag for either URL then it leads to cloaking. Many of my website’s pages were detected as cloaked by several tools. After using the canonical tag on the archive, everything was fine.
Thanks for sharing
Hi Joost,
As always great article.
I am currently cleaning up a client’s website where previously somebody did some real harm by “Using rel=canonical on not so similar pages” to boost link juice to a few pages on the website. Google penalized this website badly, and it lost most of its search traffic.
Would you recommend a manual review by Google after cleaning up the website?
If you wish to add any further recommendations they will be appreciated.
Best regards,
James.
So if we put a self-pointing canonical on every page, we are safe from duplicate content issues caused by those rogue requests? That’s nice, I was always having problems with that.
Good article. Curious about URL cloaking for FB and Twitter. We carefully canonicalize everything back to a single canonical URL across multiple sites where content might be viewed, but this creates problems with FB and Twitter sharing, especially when sites are branded or private labeled with their own FB campaigns. I saw on Stack that FB is okay with cloaking the canonical URL so it’s different (local) for FB and Twitter but constant for Google and other bots. Is this valid / white hat / the right way to handle this?