Today my buddy Sander pointed out that he suddenly had pages showing as noindex,nofollow when he ran a spider across a site. A bit more research taught us that WordPress automatically adds a noindex,nofollow robots meta tag to each URL that has ?replytocom in it. At first I (wrongly) thought this was new in WordPress 3.5, but it turns out to have been the default behavior for quite a while already. All the more reason to tell you about it.
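If you want to check your own pages for this, here is a minimal sketch in Python that pulls the directives out of a page's robots meta tag. The sample markup and the names `RobotsMetaParser` and `robots_directives` are my own assumptions for illustration; the exact tag WordPress prints may be formatted differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical markup, assumed to resemble what a ?replytocom URL returns.
sample = """<html><head>
<meta name='robots' content='noindex,nofollow' />
</head><body></body></html>"""

print(sorted(robots_directives(sample)))
```

Feed it the HTML of a URL with ?replytocom in it and of the clean URL, and compare the two results.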
What are these ?replytocom links?
This would force a reload of the page and give you the option to reply to the comment with ID 1. I absolutely hate that fallback link. On a site like this one, where posts often get over a hundred comments, it means there are 100 links pointing to that same article, causing a lot of totally unneeded crawling. For this reason I added an option to my SEO plugin to remove it, which you’ll find under SEO → Permalinks:
So what does this noindex,nofollow do?
Unfortunately, the robots meta tag WordPress adds essentially turns every URL with ?replytocom in it into a dead-end street. Because of the nofollow bit of that robots meta tag, if, say, Mashable linked to a URL with replytocom in it, my site wouldn’t actually benefit from that link. Doing nothing is much better: the rel="canonical" link element on the page, which points to the clean version, would tell search engines to use that clean version.
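To make "the clean version" concrete, here is a rough sketch of how a replytocom URL maps to the URL the canonical should point at. It is written in Python purely for illustration and is not how WordPress or my plugin computes the canonical; the function name `strip_replytocom` and the example.com URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_replytocom(url):
    """Return the URL without its replytocom parameter and fragment."""
    parts = urlsplit(url)
    # Keep every query parameter except replytocom.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "replytocom"
    ]
    # The fragment is dropped as well, since it only matters to the browser.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical URLs, for illustration only.
print(strip_replytocom("https://example.com/some-post/?replytocom=1#respond"))
print(strip_replytocom("https://example.com/some-post/?page=2&replytocom=15"))
```

All the replytocom variations of a post collapse to one and the same clean URL, which is exactly the consolidation the canonical element asks search engines to do.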
This is why, when I found out, I immediately released version 1.3.3 of my WordPress SEO plugin, which removes that noindex,nofollow line. I’ve also opened a Trac ticket; we’ll see what happens with that. For now, my advice is: upgrade to 1.3.3 and check that remove