W3C Validation: Why you should care (and why not)
Every once in a while I get an email about W3C validation. The people emailing me either point me at the fact that my own site doesn’t validate correctly (thank Facebook for most of that), or they ask me whether I think W3C Validation is important. Most of them ask even more specifically: whether I think W3C validation is important for SEO.
Since I’ve taken it upon myself to write an article for each question I’ve answered more than once, let’s have a look at the pros and cons of W3C validation.
W3C validation & Web design
We’ll have a look at the pros and cons of W3C validation for web design first, before we dive into the SEO aspects. In my opinion there are two types of validation errors: errors that break or hamper rendering, which I’ll refer to as hard validation errors, and errors that don’t cause rendering issues, which I’ll refer to as soft validation errors.
An example of a hard validation error: an unclosed anchor (<a>) tag. This can cause serious issues. Just about any unclosed element that should be closed and is not can cause you to look at styling issues for far too long. W3C validation is a very easy way to catch coding errors like that and thus it makes debugging and continued development of HTML a lot easier.
Having hundreds of validation errors is usually not a sign of good code quality, so trying to keep the number of errors, hard or soft, down is usually a good idea. Whether you really have to bring it down to zero is another discussion, as obsessing over certain issues can be costly. Is your client, or are you, really willing to pay to get the page to validate when the only thing wrong is that you use an iframe (something I’d call a soft validation error) when you’re required to use a Strict DTD? Some clients might be willing to pay for that; most I’ve worked with don’t care that much.
I’ve seen people do weird things to get their site to validate. Adding a JavaScript library to your site purely for the goal of getting your site to validate, for instance. Is that really a service to the website’s visitors? Adding 50kb of download for each visitor because you didn’t stand up to your client? I don’t think so.
W3C validation & SEO
So, on to the question that’s on a lot of people’s minds. While Googling for this subject I noticed loads of people have written about this topic, which made me decide just my own opinion wouldn’t suffice for this article; so you’ll get mine and the opinion of some of my friends. First of all, let me draw from my own experience.
My own opinion on Validation & SEO
While doing SEO for a major Dutch news site, I found out this site had its entire front page show up blank in Google’s cache. The reason was that they were using an unclosed, and rather obscure, HTML tag: the XMP tag. The XMP tag is basically the same as the PRE tag, but instead of rendering tags inside it, it outputs them. This XMP tag wasn’t closed properly, and thus Google’s spider choked on it, causing their pages not to be indexed the way they should. So this error caused a browser, which is what Googlebot is in essence, not to render. That’s the kind of validation error I think you should fix. For other validation errors, like the use of target="_blank" with a Strict document type, I do not care, and would not want my client to spend development time on my behalf fixing it.
I do know that I use the number of errors as a sign of code quality when doing quick scans, and poor code quality can very well be a reason for ranking badly. Do remember though that validating your site with the W3C doesn’t validate the semantics of your HTML.
But as I said: that’s just my simple opinion. Let’s get some of my friends’ opinions:
Respected SEOs on W3C Validation
First of all I asked Aaron Wall, of SEOBook fame. His answer was pretty direct:
“If you want to get links from web designers who charge high rates then W3C validation is important to SEO, otherwise it has little direct importance outside of ensuring proper rendering to end users. When one visits Amazon.com or Google or Yahoo! (or just about any billion Dollar+ internet company) they will find a website that doesn’t validate. Why is that?”
Is Aaron suggesting that validation being good for SEO is a web design industry scam because they need a better reason to get people to pay for writing valid HTML other than “we like it”? Yes, he is.
Next up was Greg Boser. His reply went along the same lines:
“We try to use frameworks that validate, but we don’t spend a ton of time trying to rework a plugin or widget that cause minor errors that aren’t site operation critical.”
Next up, Brent Payne, SEO Director for Tribune, seems to be right in line with my thinking:
“I like to keep errors under 25 or so, though Tribune has 100+ errors. Perfect code, I don’t think is necessary but you don’t want to have too malformed of HTML. Some say it is a ranking factor, I say you just don’t want to have stuff that is too unexpected for the bots.”
Dennis Goedegebuure, Senior Manager & Head of SEO at eBay Inc., said:
“It depends on the type of errors and how many, it all depends on whether the crawler can actually read the real content of the page.”
Lastly, I asked Jaimie Sirovich, the author of two technical SEO books. He said:
“As long as google.com doesn’t validate, I’d say no. They actually don’t quote attributes, I’d guess deliberately to shrink page size.”
(In fact, as Dennis pointed out, Google doesn’t even close the body and html tags). When asked what kind of issues he would fix, Jaimie was very resolute:
“Tag nesting, that’s about it. Just make sure it’s a tree.”
In other words: fix unclosed and/or improperly nested tags, don’t bother about the rest.
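To make "just make sure it's a tree" a bit more concrete, here is a minimal sketch of such a check in Python, using the standard library's html.parser. The names NestingChecker and check_nesting are my own, purely for illustration, and this is not how the W3C validator works. It is also deliberately stricter than HTML itself: closing tags that HTML lets you omit (on p or li, for instance) are reported as unclosed too.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are never pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records tags that are unclosed or closed out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open tags, innermost last
        self.problems = []  # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Everything opened after the matching start tag was never closed.
        while self.stack[-1] != tag:
            self.problems.append(f"unclosed <{self.stack.pop()}> before </{tag}>")
        self.stack.pop()


def check_nesting(markup):
    """Return a list of nesting problems; an empty list means the tags form a tree."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    # Whatever is still open at the end of the document was never closed.
    leftovers = [f"unclosed <{tag}> at end of document"
                 for tag in reversed(checker.stack)]
    return checker.problems + leftovers


# The unclosed anchor from earlier in this article shows up as a problem.
for problem in check_nesting('<div><a href="#">link</div>'):
    print(problem)
```

A check like this only looks at the hard errors discussed above. It says nothing about disallowed attributes or doctype rules, which is exactly the category you can afford to ignore.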
The verdict on W3C Validation & SEO
Most SEOs seem to agree that having code that isn’t properly nested or has big errors is bad for SEO. They also all agree that having fully valid HTML is not going to get you any better rankings.
My final conclusion thus is: both for web design & SEO reasons, you’ll want to fix any and all blatant errors that might cause bad rendering or parser issues. Don’t worry about attributes that are not allowed though, nor about that one plugin using <b> tags instead of <strong>. It’s just not worth your time or money.
Don’t agree with me? Here’s a kicker: even Matt Cutts says there’s no bonus for validating, check out the following video:
Update, March 15th 2011, Matt did another video together with Danny Sullivan of SearchEngineLand:
Great article, the proven link between good clean validated coding and the effect on SEO, means i ensure all of my sites are compliant.
Surely it’s just good practice when building a website to ensure that it validates, validation doesn’t just help with search engines but all cross browser compatibility and accessibility.
Regards
Rob
I totally agree with everything you said above. I spend a lot of time explaining to clients that a few minor errors will not affect the SEO of the website. For some reason, when quoting on websites we find that other web design companies have pushed hard the idea that clients need valid HTML and should avoid companies that don’t offer it. The top-ranking website in the UK for ‘Web Design’ has a lot of HTML errors, which proves the point that it’s not essential to SEO.
Although there are plenty of “hard errors” on the Google page, HTML has always allowed you to omit closing tags and attribute quotes in some cases, so those are not necessarily errors.
The bigger message ought to be that you have to remember both human and machine consumers of Web pages, as well as Braille terminals and screen/text readers, and Web clients both current and future.
I think we agree on that one, but stuff like ARIA (mostly known for the aria-required attribute) has never validated, and yet is good for a lot of people. It’s that kind of validation failure that I don’t care about. Of course I care about stuff that would break people’s browsers.
Thanks for this excellent article. Keep it up!
Great post! You just took a load of pressure off concerning validation. I have always thought I had to have 0 errors to look competent, but I gave up as it took too much time. I think as long as it doesn’t interfere with rendering and user experience, it shouldn’t be a bother.
Validation is a mixed bag. With our clients, we stress that the template code should validate. It is a good starting point for most. As for the code on the page, while we agree that there are bad errors and not so bad errors, it often costs more time and effort to have us review the code to determine which error is ok and which is not ok. If we tell clients to always validate code, then we do not have to check everything every time and in the long run it is cheaper. Otherwise you can get into discussions even arguments about what is good HTML and what is not. This is really true when there are multiple development teams working on a site.
Having pages validate has proven to be easier over the years than to evaluate each page on an individual basis as to whether the errors are serious enough to fix or not to fix.
– Roger
I really think people should stop giving Google such a hard time about it not validating. Although people think it is easy to just give quick fixes that return small size pages and validate, it’s literally impossible to factor in all of the different browsers / devices that Google’s website must see everyday. I know myself that my website doesn’t validate because of Internet Explorer 6.0 and I’m sure that there are other browsers throughout the world that are 10x worse! Validation is important, but I’d rather have my website look normal to a user than to be able to display a W3 Validated badge on my website.
-Andrew
The larger and more frequently updated a site, the less realistic valid markup is. Taking the time to scan the result of a validation is definitely worthwhile to avoid any major issues, as code should be well formed to avoid errors such as ‘lost’ content. However, as per the above we know that many large sites do not validate fully and suffer no apparent ill effect. Weighing the benefit to the business of having well formed code v. the cost to implement it is the method we use to assign priority. This often adds up to a poor investment of time and resources, versus more productive ways of spending your time.
In keeping with the title, I do and don’t agree with you. I agree that there are some important parts to validation, like unclosed tags that will cause major issues, and some less important ones. For example, I remember Amazon affiliate links never used to validate; I ended up making PHP redirects to keep their code off my page.
And it’s very true that validating your site does eliminate a lot of cross-browser issues, and it’s cheaper than paying for a service like browsercam.
However, all of these points are based on practicalities. I’m a strong believer in doing things the right way, and that’s why the W3C is there, right? It’s the industry-recognised way of doing things: in this sense I think it IS very important for sites to validate. I think it goes back to the old days of designers writing pure table layouts to get the desired aesthetics, whilst sacrificing everything else.
Nice to see a post about something we DON’T have to do! Very refreshing and interesting. Thanks!
Even if validation is SEO hokum (i don’t believe it is), the fact remains that there is no good reason *not* to validate your code. It makes it much more maintainable in the future and bug fixing is infinitely easier.
One major point you forgot to address is browser scripting. Errors on the markup trigger the error correction mode on browsers, which can lead to unwanted DOM quirks. Most of the stuff people keep crying about browser scripting on the Web is closely related to markup issues. If you keep your markup valid, manipulating the DOM shouldn’t be too hard.
Also, once you get acquainted with W3 specs, you get to write valid HTML from starters. So, in the end of the day, you just have to fix minor issues that got by during the development process.
I agree with the points made in this article. Pay attention to validation, but don’t overdo it.
It’s also worth mentioning that the point of the W3C is to help push the web forward to ever greater things. Any SEO benefits are secondary to that purpose.
Interesting post. As a designer I wasn’t 100% sure whether validation did affect SEO, but as a standard I make sure all of my sites are valid for both XHTML and CSS.
Although you say and everyone agrees that it is not that important SEO wise to a certain degree, I believe it is still good to show my customers that their site is valid unlike a lot of other sites out there.
It is more of a selling point than anything else, and it makes me feel good when I see the green box in my browser :)
It’s not really that difficult to make a site compliant, so why not do it? When designing from scratch, that is; it can be tough when taking on a site just for SEO.
I really enjoyed this article and the attached video and quotes. What an interesting read!
I am one of these people who are really meticulous about code being valid, but now I may sleep easier on those odd jobs where W3C and my project don’t agree…
There’s NO EXCUSE for lazy mark-up.
I think accessibility is at least worth a mention. If you are near validation, your pages likely work with assistive devices such as screen readers and can be rendered in different ways to suit user preferences. I believe Craig Mazur mentions rendering too. Coding to standards and attempting to achieve validation is good for accessibility and the overall user experience. For Government (which is where I work), it is fast becoming a legal requirement.
Semantic and properly nested markup can also be good for open data and content repurposing. Good for you when you want to migrate or do some data wrangling and good for others wanting to use your content. Just a couple of other reasons to follow standards and monitor validation.
I like your comments plugin for the threaded comments. Which one do you use?
this is my first visit to your blog! We are a group of volunteers and starting a new initiative in a community in the same niche. Your blog provided us valuable information to work on. You have done a marvellous job!
Let’s make sure that we do not give anyone the impression that validation does not matter at all. Validation is good to help assure that web pages render properly across multiple browsers as well as to help identify egregious coding errors that can prevent a spider from indexing a page.
In most cases, it won’t matter if the page validates. But in some, such as the XMP tag issue that you found, Yoast, it can matter. Over the years I have found similar errors that prevented the proper indexing of a page. The W3C validator easily identified those issues.
Put it this way: It is a good thing when a page does pass validation, but it probably will not help to improve your rankings unless you find a serious problem–and that is the reason to check the validation.
If your code is valid then you know that any rendering issues are down to the browser. If your code isn’t valid you’re making life hard for yourself and anyone else who has to work on the project, as you’ll never know if it’s the bad code or a browser bug, or possibly both combined!
But that’s mainly for the big errors. Using an un-escaped ‘&’ in a URL will cause the page not to validate but it wont actually break the page (unlike, as Yoast says, things like div’s and ‘a’ tags that will).
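As a small illustration of the un-escaped ‘&’ case, here is a sketch in Python that escapes a URL before it goes into an href attribute. The href_attribute helper and the example.com URL are made up for this example; html.escape is part of the standard library.

```python
from html import escape


def href_attribute(url):
    """Return an href attribute with the URL escaped for use inside HTML."""
    # escape() turns & into &amp; and, by default, also escapes quote characters.
    return f'href="{escape(url)}"'


print(href_attribute("http://example.com/page?id=1&lang=en"))
# prints: href="http://example.com/page?id=1&amp;lang=en"
```

Browsers cope with the raw ‘&’ just fine, which is why this is a soft error, but escaping it at the point where the markup is generated costs next to nothing.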
You should also consider that there is a legal requirement for websites to meet priority level 1 (so A standard) and part of this is the code has to be W3C valid within Europe. In the US you need to meet a subset of this called Section 508.
So I guess it depends on client requirements. But to my mind, if you’re writing HTML from scratch it’s too easy to write valid code, so why not? I don’t find it takes longer, and you know you’ve provided the client with a professional service.
Now with WP and some plugins this isn’t always possible of course…
If a client needs to use target="_blank" and iframes then change the doctype to transitional, problem solved!
I agree with you on this one: “If a client needs to use target="_blank" and iframes then change the doctype to transitional, problem solved!”.
Validation is all about good behaviour and eliminating rendering errors….
A great read, thanks for the info.
Thanks for this. I was getting paranoid about errors – had got mine down to 9, but CSS is a mess still.
Now I can sleep longer at night!
Good stuff!
Yes, get some sleep. And tell everyone to get sleep. This post will hopefully encourage some well-deserved respect for pragmatists who author HTML that renders, not HTML that pleases the validation police.
Amen to that :) (and testing something ;) )
Hey Juiced,
Agree with everything you’ve said above. The one other thing that I don’t see emphasised much about validation is simply cross-browser testing. I’ve found that having relatively few validation errors makes it significantly easier to ensure designs are cross-browser compatible without going through the usual world of hurt that such things can cause.
Cheers,
Alastair.
(P.S. Off-topic: check out your comments line breaks in FF latest, seems a few pixels too small)
Although this article is geared towards “should I or shouldn’t I?” it is also about “My Customer”.
Being a strong salesman is what keeps together what is important and what is not important. If you are trying to run a business and steer clients – you have to direct the horses to the proverbial water.
I’m glad this post came out to help solidify my standing on W3C Validation, but by no means have I ever even brought it up to a client who didn’t ask about it. We, as SEO and Web Design folks, must remember that we are the experts – even if we’re not as smart as Joost – we’re smarter than our customers.
Point them in the right direction, show them where the real money is, and don’t even mention W3C.
(By the way – Google doesn’t close some of their tags?! Brilliant!)
It’s a subject I have wrestled with for some time, until now that is.
As Jaimie Sirovich pointed out when asked what kind of issues he would fix:
“Tag nesting, that’s about it. Just make sure it’s a tree.”
Great article!
I can finally convince my customers to forget about less important stuff (like w3c validation) and concentrate on the stuff that make them money …
Jozef
Nice article. What I understand is that so far W3C validation does not play a central role in ranking a website better, but we still hope it will carry more weight after some time, as HTML5 spreads and new SEO techniques are introduced.
Great post. I always validate my code, as validation is best practice, but I do distinguish between real errors and unsupported attributes, for example. Thanks.
This is a great post, I know a few years ago I used to be obsessive about having no errors, but now I’ve realised there are better things I can do with my time. Don’t get me wrong I do make sure there are no hard errors in the code, but I’m happy to let a few soft errors pass through.
Nice to see though that the SEO effect for valid code is minimal!
I hear this question a lot, and it is usually because a web design company has explained the benefits of valid code and how much they will happily do the work for. At least I can point people here for some seasoned feedback. Thanks Joost.
Karl
The problems that Facebook gives you with validation can easily be solved by shortening the Facebook link with TinyURL.
But why do you care at all? If Google can parse the page there is no problem.
It’s not like food on your shirt on a first date. Google won’t care because they would have to write a deliberate algorithm to care; and it is in their interest to understand the page as browsers would as priority #1.
Browsers have no problem with Facebook links. So why should Google?
No, it might not be a problem for Google. But since it is such little effort to fix it, why should you not care? If you work for clients they might appreciate it if their site validates. No reason to let one invalid character in a Facebook link stop you from getting validated.
Very good post Yoast!
Hey Joost, what is your opinion on using some of the HTML5 that works in most browsers?
More specifically, do you see any reason to not use <!DOCTYPE html>, or omitting the text/css etc? Also, do you think it is still important to worry about IE6 if the audience isn’t mostly older folks or people who never update?
I use HTML5 on some sites in development and quite like it. Especially the video tag is a blessing for search marketers, although I’ve been doing some other tests that look quite promising too.
Whether or not you still worry about IE6 is mostly up to your web analytics: if it’s less than 1% you might start dropping support, if it’s more than that… Well I wouldn’t just break your site for them yet.
Just a small note about Dennis Goedegebuure’s observation that Google does not use closing body and html tags: this may likely be because HTML5 does not require closing tags for some elements, both body and html included.
HTML5 section 8.1.2.4 Optional tags
The reason Google does not use the </body> and </html> tags is to save money: page loading time drops by up to 2ms. That equates to hundreds of thousands of dollars in energy and page loading costs.
Good, informative article! Never got the fact that people would spend so much time trying to work around the target="_blank" attribute with some JavaScript, just to get the site to validate. Thanks for the opinions of the SEO guys. :)
It’s all about the user experience. So if a browser renders your website as it should, then there wouldn’t be any positive or negative influence on your ranking if your site isn’t W3C compliant. So W3C compliance is no factor in SEO.
Excellent article. There has to be a balance. Blindly following the obsessive need to be 100% valid is just not necessary.
Matt also said recently at a conference that validation as a ranking factor would be stupid: throwing out over 40% of websites on the web because the HTML wasn’t perfect would be stupid at best.