Why you should not use XML Sitemaps

Update: This article was published in 2007. Nowadays, XML sitemaps have found their way into our WordPress SEO plugin, and we do recommend using them for any (larger) site.

First of all: I do want you to use Google Webmaster Tools. It’s an incredibly useful toolbox, so please, please use it. If you’re not using it at the moment, check out my post on how to use Google Webmaster Tools. I do think, though, that XML Sitemaps (whether Google sitemaps or otherwise) are not doing you any good in most cases…

My arguments aren’t new. Dave has made it very clear why an XML sitemap is BAD: it’s basically throwing away very valuable data… Up until now, using an XML sitemap hasn’t been a real problem, because Google throws badly linked pages into the supplemental index; you could check that and see where you were going wrong. The scary thing, though, is that Google has already shut down the query with which you could request all supplemental pages. A query which, as Rand pointed out while requesting the return of the supplemental query, was very valuable.

Google still shows the “Supplemental Result” tag, though, so there are still ways to find out whether a page is linked well enough… However, if you read the comment thread on the SEOmoz article above, you will see Matt Cutts saying:

As I mentioned at SMX Seattle, my personal preference would be to drop the “Supplemental Result” tag altogether.

Now if that happens (and with Matt saying this, there’s a good chance it will), how will you know which of the pages you’ve pushed into Google’s index through its own “shortcut” are not receiving the linklove they need?

In white hat SEO, there are hardly any shortcuts, and you should think at least twice before taking one… XML sitemaps might get you indexed faster, but they won’t make you rank any better. In my opinion, there’s only one reason to use XML sitemaps: you run a news site and need your news in Google’s index FAST to get the maximum amount of traffic out of it. This only works if your site has the authority needed to make a new article rank immediately, and it means you should only put your news articles into the XML sitemap.
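
For reference, the news-only setup described here would just be a standard sitemaps.org file limited to the news articles. A minimal sketch (the domain, path and date below are made-up placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sitemap listing only the news articles; example.com is a placeholder -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/news/big-story/</loc>
    <lastmod>2007-07-29</lastmod>
  </url>
</urlset>
```

Everything else on the site would stay out of the file and be left to normal crawling.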

For any other purpose, I would suggest throwing away your XML sitemap now and starting to use the data that Google gives you for free: check which pages get indexed when, and try to find the patterns in it.




33 Responses

  1. By John on 29 July, 2007

    At least one of the supplemental-only queries still works :-).

    Here’s my slightly skewed (I made the GSiteCrawler freeware sitemap generator) take:

    - If your site is struggling with the number of URLs, the number of URLs indexed, and the value spread over them, you WILL have to be careful about the URLs you list in your sitemap file. However, if you have a well-defined set of URLs (either limited by design or limited by robots.txt), the sitemap file will just include the URLs that Google could find anyway. Where’s the problem with that? If you have more URLs in your sitemap file than Google can “legally” crawl, then Google will ignore those anyway. I don’t see how that could be a reason against Sitemaps.

    - Like you mentioned, Sitemaps can really push new URLs into the index quicker. However, it’s not just for news sites! Imagine you have a site with 1000 URLs of which 10 change (new products, etc.): with an XML Sitemap you can direct the crawlers DIRECTLY to those URLs. Without the sitemap, they would have to crawl the whole site before stumbling upon those 10 URLs, and if your site has marginal value, it will take a looong time for all the URLs to get crawled (if they’re low-value/priority).

    - This doesn’t really apply to all you webmaster stars :-), but 99.9% of the webmasters out there haven’t reached your level and likely never will. When generating a sitemap file (provided it’s done not through a plugin but through an external crawler), those webmasters will see their website through the eyes of a crawler for the first time. You would not believe how many people contact me to tell me that my crawler is broken: it only finds 1-3 pages. Well, duh! No wonder, if your site uses javascript navigation, is just a frame front to some obscure free-hosting URL, is filled with cross-domain links, etc. etc. etc. I don’t mind the mails, but it shows how far the general public is from understanding even the most basic issues regarding the crawling and indexing of their sites.

    Did you know *all* Blogger blogs use a sitemap by default? (The Atom/RSS feed serves as the sitemap, specified in the robots.txt for auto-discovery.) I don’t think you can even turn that off :-).
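
    The auto-discovery John describes is part of the sitemaps.org protocol: a single `Sitemap:` line in robots.txt pointing at any sitemap (or a feed acting as one). A sketch with a placeholder URL:

    ```text
    # robots.txt: sitemap auto-discovery line (sitemaps.org protocol)
    Sitemap: http://www.example.com/atom.xml
    ```

    Crawlers that support the protocol pick this up without any manual submission.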

  2. By André Scholten on 30 July, 2007

    There is a good reason for using them: when you move a website to another domain, or when you change to a new URL structure.

    With an XML sitemap you can submit all your old URLs so they will be recrawled soon, and the search engines will pick up the 301s. This way even small and unimportant pages can be transferred very quickly.
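
    A sketch of André’s scenario, assuming Apache with mod_alias (the domain names are placeholders): put the 301 in place on the old domain first, then submit a sitemap of the old URLs so crawlers revisit them quickly and pick up the redirects.

    ```apacheconf
    # .htaccess on the OLD domain: permanently redirect every URL
    # to the same path on the new domain
    Redirect permanent / http://www.new-domain.example/
    ```

    The sitemap submitted for the old domain would then simply list the old URLs, each of which now returns a 301.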

  3. By Myron Rosmarin on 31 July, 2007

    We have a million pages that are primarily discoverable through a search front-end, and so aren’t crawlable in the usual way. We’ve also built a browsable directory to reach these pages via crawling, but that is still a mediocre solution given how deep the crawlers have to dig.

    I have found that the XML sitemaps accomplish two things for us: 1) we can provide the crawlers with the exact location of each of these pages and 2) we can clean up the URLs and make them very SEO friendly — much more so than what we had to launch with.

    I can’t think of a better mechanism for aiding discoverability for these pages than the XML sitemaps but I am certainly open to ideas :-)

  4. By Stephen Parsons on 31 July, 2007

    We use a content management system on many of our sites that uses only a couple of *.asp files to pull content from a database. For example, a page containing contact info might be http://www.mydomain.com/default2.asp?page_id=123 while another with site news might be http://www.mydomain.com/default2.asp?page_id=456. Note that in both cases the filename is still default2.asp with a parameter passed as a querystring.

    Can XML sitemaps help us to get pages of that content indexed or is there a better way to get Search Engines to find important content there?
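
    Query-string URLs like these can be listed in an XML sitemap as-is; the one gotcha in the sitemaps.org protocol is that special characters in URLs must be entity-escaped. A sketch using Stephen’s example URLs (the second entry adds a hypothetical extra parameter purely to show the escaping):

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.mydomain.com/default2.asp?page_id=123</loc>
      </url>
      <url>
        <!-- a raw "&" is invalid in XML; it must be written as "&amp;" -->
        <loc>http://www.mydomain.com/default2.asp?page_id=456&amp;lang=en</loc>
      </url>
    </urlset>
    ```

    Whether listing them is wise is exactly what this article debates; making the pages reachable through normal internal links works regardless.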

  5. By Tadeusz Szewczyk on 31 July, 2007

    Please note that the supplemental query still works; a new one was even added recently. So the rumor that it no longer functions is not true.
    Just check out my examples:
    http://seo2.0.onreact.com/new-way-of-checking-supplemental-results

  6. By deInternetMarketeer on 31 July, 2007

    Congrats Joost,

    Good article. XML sitemaps are in most cases a waste of time.
    And if they help anyone, that someone is called Google. They make enough money already to pay their own bills and do their work themselves.

    I never understood why people were so happy with XML sitemaps, and I still don’t understand why they love GWT.

    All these Google tools are made to help Google, not to help you.

  7. By Luc de Medts on 31 July, 2007

    I had a very bad experience with XML sitemaps in the past. A couple of well-designed and optimised websites (completely written in HTML and with text navigation) were submitted to Google. I added an XML sitemap to them and waited… for a very long time. At last I removed the XML file from the directory, and a few weeks later all the sites showed up. They’re still climbing every month in the organic search results.
    Therefore I strongly believe XML sitemaps only matter when your site has a bad (read: not search engine friendly) design. When a site uses javascript or flash navigation, the XML sitemap may be the last hope to get it indexed. For other, correctly built sites I think it is rather a handicap…
    I fully agree Google has other valuable webmaster tools to use. But let’s forget the whole XML sitemaps fairytale.

  8. By ron angel on 31 July, 2007

    If people should not use sitemaps, as you say, why does Google promote them? I have XML, ROR & RSS sitemaps on my site just in case;
    my site logs show they are looked at.
    http://www.ssrichardmontgomery.com

  9. By deInternetMarketeer on 31 July, 2007

    So you think Google makes tools to please you?
    Why would a company do that without generating a lot of useful info for themselves out of it?

  10. By ron angel on 31 July, 2007

    I do not care what info they get out of them, as long as it lets more people know the site is there. There’s no such thing as bad publicity for a website with normal, legal content. Have a look at the site yourself if you have the time, & while you are looking, make me & Google happy by clicking on any of their ads that you find of interest! (grin)

  11. By deInternetMarketeer on 31 July, 2007

    Perhaps you should care …

    Btw, you are violating the Google AdSense TOS with your comment.

  12. By ron angel on 31 July, 2007

    Btw, you are violating the Google AdSense TOS with your comment
    ————
    I don’t think so; notice the wording: “that you find of interest”. All I am doing is pointing out that the ads are there; it is up to you to click on any of them ONLY IF you find them of interest, which is what I think they intended. It’s entirely up to you which ad, if any, you do or don’t click on. Defence rests…

  13. By Joost de Valk on 31 July, 2007

    @John: thx dude ;)

    @André: though I agree that you COULD use sitemaps that way, I see no reason why you should… If your site is well optimized, Google will spider the new pages very rapidly anyway, giving you valuable data in the process…

    @Myron Rosmarin: your story is exactly the reason why I hate Google sitemaps… They will get your pages indexed, but NOT ranking… For a page to be able to rank, it HAS to be linked from and discoverable through the web somewhere…

    @Stephen: if you have a proper site structure, in which pages link to each other, all these pages will be found and will rank. Feel free to email me if you need help with that (goes for everyone, of course).

    @Tadeusz: you might want to consider how wise it was of you to blog that…

    @deInternetMarketeer: GWT DOES help you as an internet marketeer; if you can’t see how it helps you, and not only Google, I think you should read the article I mentioned. And BTW, yes, they make tools to please me. That’s called marketing.

    @Ron: of course they are looked at; you might consider, though, why you would need them… After all, isn’t it Google whose first rule is: make pages for people, not for search engines? The XML sitemap goes against that rule more than anything…

  14. By deInternetMarketeer on 31 July, 2007

    GWT probably helps, Joost, otherwise people would not use it.
    But a year ago everyone thought the same about XML sitemaps.

    I’m not using it, and won’t unless it could be helpful for the site I’m working on.
    But that would have to be a site I wouldn’t want to work on :)

    Yes, Google does good marketing.
    Only their marketing targets advertisers, not webmasters.
    But we wouldn’t agree on that, so no need to discuss it further.

    Will read the article you wrote about it in a minute!

  15. By Myron Rosmarin on 31 July, 2007

    @Myron Rosmarin: your story is exactly the reason why I hate Google sitemaps… They will get your pages indexed, but NOT ranking… For a page to be able to rank, it HAS to be linked from and discoverable through the web somewhere…

    That is simply not a true statement. The content on those pages does very well for queries in the tail and is responsible for a considerable amount of organic traffic from search engines. To say otherwise misrepresents the value of sitemaps.

  16. By Joost de Valk on 31 July, 2007

    @Myron Rosmarin: nope, it just means that those pages would have been indexed without Google Sitemaps as well. Had that happened, you would have known which pages Google couldn’t find, because you’d see which pages weren’t indexed, and you could have adapted your internal linking structure accordingly.

  17. By ron angel on 31 July, 2007

    I would assume that, as all the HTML pages on my site are linked via a
    human-readable sitemap.html, doing the same on Myron Rosmarin’s pages would mean they’d all be found & indexed by search engines anyway. (If not, this is a good idea for everybody to do, unless somebody knows better…)

  18. By Joost de Valk on 31 July, 2007

    @ron: you could do so much better than that… I’ll write an article soon about creating site structures; it’s clear that people need more help with that :)

  19. By deInternetMarketeer on 31 July, 2007

    Of course, and if you have a lot of pages, you split up your sitemaps and build some internal linking structure into them.

    The time you save by skipping an XML sitemap you can spend on making HTML sitemaps that are really helpful for your rankings.
    Which isn’t the case with XML sitemaps.

  20. By Myron Rosmarin on 31 July, 2007

    @Myron Rosmarin: nope, it just means that those pages would have been indexed without Google Sitemaps as well.

    We have millions of pages that are discoverable both via crawling and via the XML sitemaps. There is a huge improvement in the quality of the URL strings in the sitemaps, and I can tell from the URL strings in the Google search results exactly how the crawler found each page. Sitemaps are winning that contest by a lot. :-)

  21. By Joost de Valk on 31 July, 2007

    @Myron: are you saying the pages on your site are discoverable under multiple URLs?

  22. By Myron Rosmarin on 31 July, 2007

    @Myron: are you saying the pages on your site are discoverable under multiple URLs?

    Indeed. But that’s common for search-driven data. I don’t get too concerned about this, since the length of the crawl path to these pages makes it unlikely that crawlers will find all of it (or even most of it). Once again, I stand behind the sitemap, which makes discovery of the content infinitely easier.

  23. By Bespeckled SEO on 2 August, 2007

    Joost,

    Looks like the supplemental tag has definitively been removed. Good call!

  24. By Joost de Valk on 2 August, 2007

    Yeah, it looks like it indeed ;)

  25. By Gerben on 14 August, 2007

    On the other hand (let’s bring this topic up again): how bad is it to have pages in the supplemental results?

    One of my sites has almost 40% in the supplemental index (a lot, percentage-wise), but it’s not bad for the average ranking of my site. The pages which I want to rank are still ranked; only the supplemental results are not, and I don’t need those to rank…

  26. By ChandraPrakash Loonker on 24 September, 2007

    This is so strange. Google and Yahoo themselves promote XML sitemaps, and then they rank you lower if you submit one? I have both an XML sitemap and a PHP sitemap. When I submitted the PHP sitemap, Google gave errors. So should I believe this article, or write to Google asking them to start accepting PHP sitemaps?

  27. By Joost de Valk on 24 September, 2007

    @ChandraPrakash: just start making your sites in such a way that search engines can find their way around by themselves, and stop throwing away your spider data by submitting sitemaps.

  28. By Todd on 29 October, 2009

    Hi Joost,

    Thank you for this information. I hope you won’t mind answering two short questions: after removing the XML Sitemaps plugin from WordPress, and removing the two “sitemap” files that were written to the root directory (where our blog is located), are there any other files to remove, and do we need to “notify” Google, etc. of the removal?

    Thanks again!
    -Todd
