This is already the third post in our Google Search Console series. We’ve written about the Search Appearance section and the Search Traffic section of Google Search Console. So if you jumped in here and want to start at the beginning, please read those posts first. Today we’ll be going into the Google Index section, which gives you insight into how your website is being indexed by Google.
On the 20th of May 2015, Google announced that the name Google Webmaster Tools no longer covered the tool’s user base: only part of its users could truly be called ‘webmasters’. For that reason, Google renamed the tool Google Search Console (GSC).
The Index Status shows you how many URLs of your website have been found and added to Google’s index:
This can give you a good idea of how your site is doing in the index. If you see this line dropping, for instance, you know there’s an issue. Basically, any major and unexpected change in this graph is something you should look into.
The “Advanced” tab gives you a bit more insight into how your indexed pages are divided:
As you can see, this also shows how many of your pages are blocked by your robots.txt, and how many pages have been removed from the index; more on that in the next chapter.
There’s something else this graph makes clear. As of March 9th of last year (at the “update” line), Google Search Console shows data for HTTP and HTTPS websites separately. This means that if you’ve moved your site from HTTP to HTTPS since then, you’ll need to add your site again, using the red “Add a site” button. Then fill in the entire URL, including the HTTP or HTTPS part:
Interpretation of the Index Status
There are a few things you should always look for when checking your Index Status:
- The number of indexed pages should increase steadily. This tells you two things: Google can index your site, and you keep your site ‘alive’ by adding content;
- Sudden drops in the graph. This means Google is having trouble accessing (all of) your website. Something is blocking Google out, whether it’s robots.txt changes or a server that’s down: you need to look into it! This could also have to do with the separate HTTP and HTTPS tracking I mentioned above;
- Sudden (and unexpected) spikes in the graph. This could be an issue with duplicate content (such as both www and non-www, wrong canonicals, etc.), automatically generated pages, or even hacks.
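One classic cause of a sudden drop is a stray rule in your robots.txt. This hypothetical file, for example, tells every crawler (Googlebot included) to stay away from the entire site:

```
User-agent: *
Disallow: /
```

If a rule like this accidentally goes live, say from a staging environment, the indexed-pages line in your Index Status graph will start to fall.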
The Content Keywords area gives you a pretty good idea of what keywords are important for your website. When you click on the Content Keywords menu item, it’ll give you a nice list of keywords right away:
These are the keywords Google found on your website. That doesn’t mean you’re ranking for them; it just means they’re the most relevant keywords for your site, according to Google. You can also extend this list to 200 items, which gives you a pretty broad idea.
This tells you a few things about your site. It shows what Google thinks is most important on your website. Does that align with your own idea of what your website is about? If you find keywords here that you didn’t expect, such as “Viagra” or “payday loan”, your site may have been hacked. Conversely, if keywords you’d expect are missing from this list, there are a few things you can check:
- Your robots.txt might be blocking the page(s) that contain the keyword(s) you’re expecting;
- The page containing the keyword might not be old enough yet for Google to have crawled it;
- Google excludes keywords they consider boilerplate or common words from this list. What they’d consider boilerplate or common differs per site.
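If you suspect the first cause, you can check what your robots.txt actually allows without waiting for Google to recrawl. Here’s a minimal sketch using Python’s standard-library `robotparser`; the robots.txt contents and URLs are hypothetical, so substitute your own:

```python
from urllib import robotparser

# A hypothetical robots.txt, given as a list of lines. You could also
# point RobotFileParser at your live file with set_url() and read().
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = robotparser.RobotFileParser()
parser.parse(rules)

# Pages under /private/ are blocked, so any keywords that only appear
# on those pages will never show up in Content Keywords.
print(parser.can_fetch("Googlebot", "https://example.com/private/report.html"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/blog/seo-tips.html"))   # True
```

If a page carrying your expected keyword comes back as blocked here, you’ve found your answer before even opening Search Console.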