If your website means anything for your business, you should not forget to schedule time to do maintenance on it. Therefore we regularly write about the things you should do to keep your site in shape. In this post we’ll write about the most basic of all: checking for 404 error pages.
Note: this post does not cover the required elements of a good 404 page. We do have a separate post on that, though: Thoughts on 404 error pages.
404 error pages and broken links
One of the most annoying things that can happen to a visitor is to hit a 404 on your website. Search engine spiders tend not to like such errors much either. Annoyingly, search engines often encounter different 404s than your visitors do, which is why the first section of this post is split in two:
1. Measuring visitor 404 error pages
If you use the MonsterInsights plugin, it’ll automatically tag your 404 pages for you. If you then go into your Google Analytics account, go to Behavior → Site Content → Content Drilldown and search for “404.html”, you’ll find a ton of info about your 404s:
You’ll see URLs like this:
This tells you two things:
- The 404 URL was /wordpress/plugin/local-seo/ (it lacks an s after plugin)
- It was linked to from our WordPress SEO article.
Using this info, you can fix the 404 and go into the article and fix the link.
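If you export those tagged URLs, a few lines of Python can split each one into the missing page and the page that linked to it. This is only a minimal sketch: it assumes the tagged URL looks like /404.html?page=<missing path>&from=<referrer>, which is a hypothetical format used for illustration, so check what your own reports actually contain. The example.com referrer is made up as well.

```python
from urllib.parse import urlsplit, parse_qs


def parse_tagged_404(tagged_url):
    """Split a tagged 404 URL into (missing_path, referrer).

    Assumption: the URL has the hypothetical form
    /404.html?page=<missing path>&from=<referrer>.
    Either value is None when its parameter is absent.
    """
    query = parse_qs(urlsplit(tagged_url).query)
    missing_path = query.get("page", [None])[0]
    referrer = query.get("from", [None])[0]
    return missing_path, referrer


# Hypothetical example value, not copied from a real report.
example = (
    "/404.html?page=/wordpress/plugin/local-seo/"
    "&from=https://example.com/wordpress-seo/"
)
missing_path, referrer = parse_tagged_404(example)
print(f"Broken URL: {missing_path}")
print(f"Linked from: {referrer}")
```

Grouping the results by referrer tells you which articles to open first, because one article with a bad link tends to produce the same 404 over and over.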
As you can see from the above screenshot, we actually get 404s too. We break things all the time because our website is a constant work in progress. I don’t think anyone can totally prevent creating 404 errors on his or her site. Making sure that you notice it when you’re breaking things is a good way of not looking stupid for too long though.
2. Measuring bot 404 error pages
Besides the 404s your visitors hit, search engines will also encounter 404s on your site, and those can be quite different. You can find the 404s that search engine spiders encounter by logging into their respective Webmaster Tools programs. There are three Webmaster Tools programs that can give you indexation reports, in which they tell you which 404s they encountered:
- Bing Webmaster Tools under Reports & Data → Crawl Information
- Google Search Console under Crawl → Crawl Errors
- Yandex Webmaster under Indexing → Excluded Pages → HTTP Status: Not Found (404)
One of the weird things you’ll find if you’re looking into those Webmaster Tools programs is that search engine spiders can encounter 404s that normal users would never get to. This is because a search spider will crawl just about anything on most sites, so even links that are hidden will be followed.
If you’re serious about website maintenance, you might want to find these 404s before search engines encounter them. In that case, spidering your site with a tool like Xenu or (our favorite) Screaming Frog will give you a lot of insight. These tools are built specifically to behave just like search engine spiders and will therefore help you find a lot of issues.
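To get a feel for what such a spider does, here is a rough sketch in Python using only the standard library. It is not how Xenu or Screaming Frog work internally; it simply collects every internal link from a page's HTML, visible or hidden, and offers a helper to ask the server for the status code of a URL. The names LinkCollector, internal_links and status_of are made up for this example.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, whether it is visible or not."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute = urljoin(self.base_url, value)
                self.links.add(urldefrag(absolute).url)


def internal_links(html, base_url):
    """Return the sorted links in html that point to the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).netloc
    return sorted(u for u in collector.links if urlsplit(u).netloc == host)


def status_of(url):
    """Return the HTTP status code for url, using a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx and 5xx; the code is what we are after.
        return err.code
```

Calling status_of on each URL returned by internal_links and keeping the ones that return 404 gives you a first list to work through. A real crawler also respects robots.txt, throttles its requests and follows links several levels deep, which is why a dedicated tool is the better choice for anything but a quick check.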
Fixing 404 errors
Now that we’ve found all these 404 errors, it’s time to fix them. If you know what caused the 404 and you can fix the link that caused it, it’s best to do that. A site without broken links is the best indication of quality for both users and search engines.
As search engines will continue to hit those URLs for quite a while, it actually makes sense to still redirect those faulty URLs to the right pages as well. To create those redirects, there are several things you can do:
- Create them manually in your .htaccess or your NGINX server config
While this is not for the faint of heart, it’s often one of the fastest methods available if you have the know-how and the access to do it.
- Create them with a redirect plugin
There are several redirect plugins on the market, the best known being Redirection. This is a lot easier, but it has the disadvantage of being a lot slower: to do the redirect, the entire WordPress install has to load first. This usually adds half a second to a second to the load time for that particular redirect.
- Create them with our Yoast SEO Premium plugin
Our Yoast SEO Premium plugin has a redirect module that allows you to make redirects with the ease of the WordPress interface, but also allows you to save those to your .htaccess file or an NGINX include file, so they get executed with the speed of the first option above. It also has a few other nifty options: you can get the 404 errors from Google Search Console straight in your WordPress install and redirect them straight away, and it’ll add a nice button in your WordPress toolbar if you’re on a 404 page.
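For the manual option, writing the rules by hand gets tedious once you have more than a handful of 404s. The sketch below, in Python, turns a mapping of broken URLs to correct URLs into lines you could paste into an .htaccess file (Apache's Redirect directive) or an NGINX config (exact-match location blocks). The function names are made up for this example, and the target URL is inferred from the missing s mentioned earlier. Test the output on a staging site before using it.

```python
def apache_redirects(mapping):
    """Return one 'Redirect 301' line per old path, for an .htaccess file."""
    return [f"Redirect 301 {old} {new}" for old, new in sorted(mapping.items())]


def nginx_redirects(mapping):
    """Return one exact-match location block per old path, for an NGINX config."""
    return [
        f"location = {old} {{ return 301 {new}; }}"
        for old, new in sorted(mapping.items())
    ]


# Old path -> new path. The target is inferred from the typo described above.
redirects = {
    "/wordpress/plugin/local-seo/": "/wordpress/plugins/local-seo/",
}

print("\n".join(apache_redirects(redirects)))
print("\n".join(nginx_redirects(redirects)))
```

This simple version assumes plain paths without spaces or special characters. Paths that contain those need quoting or escaping according to your server's rules.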
Check for image / embed errors
If you look at your server logs, you’ll find 404 errors of a different type too: 404s for broken images or broken video embeds. You might also have errors that don’t show up in your logs, like broken YouTube video embeds. They don’t cause the entire page not to work, but they do look sloppy. These types of errors are harder to find because Webmaster Tools programs don’t report them as reliably and you can’t track them with something like Google Analytics either.
The easiest method to find these broken images and embeds is using one of the aforementioned spiders. Screaming Frog in particular is very good at finding broken images. Another method is to check your server logs and go through them searching for a combination of 404 and “.jpg” and “.png”.
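The log search can be scripted. The Python sketch below assumes your server writes the widely used common or combined log format, in which the request and the status code appear as "GET /path HTTP/1.1" 404. If your log format differs, the regular expression needs adjusting. The function name and the sample lines are made up for this example.

```python
import re
from collections import Counter

# Matches the request and status fields of a common/combined log format line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

# The extensions mentioned above; add others (.gif, .webp, ...) as needed.
IMAGE_EXTENSIONS = (".jpg", ".png")


def broken_images(lines):
    """Count 404 responses per image path in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        # Ignore any query string when checking the file extension.
        path = match.group("path").split("?", 1)[0]
        if path.lower().endswith(IMAGE_EXTENSIONS):
            counts[path] += 1
    return counts


# Made-up sample lines in combined log format.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /images/logo.png HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET /images/logo.png?v=2 HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:10 +0000] "GET /about/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
for path, hits in broken_images(sample).most_common():
    print(hits, path)
```

For a real log, pass an open file instead of the sample list. Sorting by hit count puts the images that break the most page views at the top.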
How often should you check for 404 errors?
You should be checking your 404s at least once every month and, on a bigger site, every week. How often doesn’t really depend on how many visitors you have, but much more on how much content you have and create, and how much can go wrong because of that. The first time you start looking into and trying to fix your 404 error pages, you might find out that there are a lot of them, and it can take quite a bit of time. Try to make it a habit so you’ll at least find the important ones quickly.