Why you should not use XML Sitemaps

First of all: I do want you to use Google Webmaster Tools. It's an incredibly useful toolbox, so please, please use it. If you're not using it at the moment, check out my post on how to use Google Webmaster Tools. I do think, though, that XML Sitemaps (whether Google Sitemaps or otherwise) are not doing you any good in most cases...

My arguments aren't new. Dave has made it very clear why an XML sitemap is BAD. It's basically throwing away very valuable data... Up until now, using an XML sitemap hasn't been a real problem: Google throws badly linked pages into the supplemental index, so you can check that and see where you're going wrong. The scary thing, though, is that Google has already shut down the query with which you could request all supplemental pages. That query, as Rand pointed out while requesting its return, was very valuable.

Google still shows the "Supplemental Result" tag though, so there are still ways to find out if a page is linked well enough... However, if you read the comment thread on the article on SEOmoz above, you will see Matt Cutts saying:

As I mentioned at SMX Seattle, my personal preference would be to drop the "Supplemental Result" tag altogether.

Now if that happens (and with Matt saying this, there's a good chance of that), how will you know which of the pages you've pushed into Google's index through its own "shortcut" are not receiving the linklove they need?

In white hat SEO, there are hardly any shortcuts to take, and you should think at least twice before taking one... The XML sitemaps might get you indexed faster, but they won't make you rank any better. In my opinion, there's only one reason to use XML sitemaps, and that's if you've got a news site and need your news in Google's index FAST to get the maximum amount of traffic out of it. This will only work if your site has the authority one needs to make sure a new article will start ranking immediately, and it would mean you should only put your news articles into the XML sitemap.
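
To make the "news articles only" idea concrete, here is a minimal sketch in Python. The page records, their "kind" and "published" fields, and the two-day cutoff are all assumptions made up for this illustration; only the urlset/url/loc/lastmod structure comes from the sitemaps.org protocol.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_news_sitemap(pages, now, max_age_days=2):
    """Return sitemap XML that lists only recent pages flagged as news.

    `pages` is a hypothetical list of dicts with the keys "loc", "kind"
    and "published" (a timezone-aware datetime). Everything that is not
    a fresh news article is left out, so it has to earn its place in the
    index through normal links.
    """
    cutoff = now - timedelta(days=max_age_days)
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if page["kind"] != "news" or page["published"] < cutoff:
            continue  # not news, or too old to need the shortcut
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "lastmod").text = page["published"].strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")
```

The cutoff is the part to tune: once an article is a few days old it should be reachable through your normal site structure anyway.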

For any other purpose, I would suggest throwing away your XML sitemap now, and starting to use the data that Google gives you for free: check which pages get indexed when, and try to find the patterns in it.
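
If you want a second data source next to what Google shows you, your own access log is a rough proxy. The sketch below is not the Webmaster Tools data mentioned above: it assumes an Apache-style "combined" log format, and it only tells you when something calling itself Googlebot first fetched a URL, which is crawling, not indexing. The function name is made up for this example.

```python
import re
from datetime import datetime

# Assumed format: Apache "combined" log lines. Adjust the pattern if your
# server logs differently.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_googlebot_visits(lines):
    """Map each requested path to the earliest visit by a Googlebot user agent.

    The user agent string can be spoofed, so treat the result as a hint
    about crawl patterns, not as proof.
    """
    first_seen = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        path = match.group("path")
        if path not in first_seen or when < first_seen[path]:
            first_seen[path] = when
    return first_seen
```

Comparing those first-visit times against the dates your pages went live is one way to start looking for patterns, for example which sections get picked up within hours and which wait for weeks.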

28 Responses to “Why you should not use XML Sitemaps”

  1. At least one of the supplemental-only queries still works :-).

    Here's my slightly skewed (I made the GSiteCrawler freeware sitemap generator) take:

    - If your site is fighting with regards to the number of URLs, number of URLs indexed and the value spread over them, you WILL have to be careful about the URLs that you list in your sitemap file. However, if you have a well defined set of URLs (either limited by design or limited by the robots.txt), the sitemap file will just include the URLs that Google could find anyway. Where's the problem with that? If you have more URLs in your sitemap file than Google can "legally" crawl, then Google will ignore those anyway. I don't see how that could be a reason against Sitemaps?

    - Like you mentioned, Sitemaps can really push new URLs quicker into the index. However, it's not just for news sites! Imagine you have a site that has 1000 URLs and 10 of those change (new products, etc.): with an XML Sitemap you can direct the crawlers DIRECTLY to those URLs. Without the sitemap, they would have to crawl the whole site before stumbling upon those 10 URLs --- and if your site has marginal value, it will take a looong time for all the URLs to get crawled (if they're low-value/priority).

    - This doesn't really apply to all you webmaster stars :-) -- but 99.9% of the webmasters out there haven't reached your level and likely never will. When generating a sitemap file (provided it's not through a plugin but rather through an external crawler) those webmasters will see their website through the eyes of a crawler for the first time. You would not believe how many people contact me to tell me that my crawler is broken: it only finds 1-3 pages. Well, duh! No wonder, if your site uses javascript navigation, is just a frame-front to some obscure free-hosting URL, is filled with cross-domain links, etc. etc. etc. I don't mind the mails, but it shows how far the general public is from understanding even the most basic issues regarding crawling and indexing of their sites.

    Did you know *all* blogger blogs use a sitemap by default? (Atom/RSS feed as a sitemap, specified in the robots.txt for auto-discovery) I don't think you can even turn that off :-).
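
The auto-discovery mentioned in this comment works through a "Sitemap:" line in robots.txt. As a small sketch, Python's standard urllib.robotparser (version 3.8 or later) can list those entries; the helper name and the example URLs in the usage note are made up, not Blogger's real ones.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no "Sitemap:" line is present.
    return parser.site_maps() or []
```

Feeding it the text "User-agent: *", "Disallow: /search" and "Sitemap: http://www.example.com/atom.xml" (one per line) should give back a list containing just that feed URL, which is a quick way to check whether a site is advertising a sitemap without you knowing about it.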

  2. There is a good reason for using them: when you move a website to another domain or when you make an update with a new URL structure.

    With an XML sitemap you could submit all your old URLs so they will be crawled soon and the search engines will pick up the 301s. This way small and unimportant pages can be transferred very quickly.

  3. We have a million pages that are primarily discoverable through a search front-end and so aren't crawlable that way. We've also built a browsable directory to get to these pages via crawling but that is still a mediocre solution given how deep the crawlers have to dig.

    I have found that the XML sitemaps accomplish two things for us: 1) we can provide the crawlers with the exact location of each of these pages and 2) we can clean up the URLs and make them very SEO friendly -- much more so than what we had to launch with.

    I can't think of a better mechanism for aiding discoverability for these pages than the XML sitemaps but I am certainly open to ideas :-)

  4. We use a content management system on many of our sites that uses only a couple of *.asp files to pull content from a database. For example, a page containing contact info might be http://www.mydomain.com/default2.asp?page_id=123 while another with site news might be http://www.mydomain.com/default2.asp?page_id=456. Note that in both cases the filename is still default2.asp with a parameter passed as a querystring.

    Can XML sitemaps help us to get pages of that content indexed or is there a better way to get Search Engines to find important content there?
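
On the purely technical side of that question: URLs that differ only in their query string can be listed in a sitemap like any others, as long as characters such as "&" are entity-escaped in the XML. A minimal sketch follows, using the default2.asp and page_id names from the comment; the extra "lang" parameter in the usage note is hypothetical and only there to show the escaping. Whether listing such pages helps them rank is a separate matter.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_querystring_sitemap(base_url, page_ids, extra_params=None):
    """Return sitemap XML with one <url> entry per page_id.

    ElementTree escapes "&" to "&amp;" when serialising, so the <loc>
    values stay valid XML even with several query parameters.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_id in page_ids:
        params = {"page_id": page_id}
        params.update(extra_params or {})
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}?{urlencode(params)}"
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_querystring_sitemap("http://www.mydomain.com/default2.asp", [123, 456], {"lang": "en"}) produces two entries whose loc values contain "page_id=123&amp;lang=en" and "page_id=456&amp;lang=en".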

  5. Please note that the supplemental query still works; a new one was even added recently. So the rumor that it no longer functions is not true.
    Just check out my examples:
    http://seo2.0.onreact.com/new-way-of-checking-supplemental-results

  6. Congrats Joost,

    Good article. XML sitemaps are in most cases a waste of time.
    And if they help someone, then that someone is called Google. They make enough money already to pay their own bills and do their work themselves.

    Never understood why people were so happy with xml sitemaps, still don't understand why they love GWT.

    All these Google tools are made to help Google, not to help you.

  7. I had a very bad experience with XML sitemaps in the past. A couple of well designed and optimised web sites (completely written in HTML and with text navigation) were submitted to Google. I added an XML sitemap to them and waited... For a very long time. At last I removed the XML file from the directory and a few weeks later all sites showed up. They're still climbing every month in the organic search results.
    Therefore I strongly believe XML sitemaps only matter when your site has a bad (read: not search engine friendly) design. When using javascript navigation or flash navigation in a site, maybe an XML sitemap is the last hope to get your site indexed. For other correctly built sites I think it is rather a handicap...
    I fully agree Google has other valuable webmaster tools to use. But let's forget the whole XML sitemaps fairytale.

  8. If people should not use sitemaps, as you say, why does Google promote them? I have XML, ROR & RSS on my site just in case;
    my site logs show they are looked at.
    http://www.ssrichardmontgomery.com

  9. So you think Google makes tools to please you?
    Why should a company do that without generating a lot of useful info for them out of it?

  10. I do not care what info they get out of them as long as it lets more people know the site is there. No such thing as bad publicity for a web site with normal legal content. Have a look at the site yourself if you have the time, & while you are looking make me & Google happy by clicking on any of their ads that you find of interest! (grin)

  11. Perhaps you should have to care ...

    Btw, you are violating Google Adsense TOS with your comment.

  12. Btw, you are violating Google Adsense TOS with your comment
    ------------
    I don't think so, notice the wording: "that you find of interest". All I am doing is pointing out the ads are there. It is up to you to click on any of them ONLY IF you find them of interest, which is what I think they intended. Entirely up to you which ad, if any, you do or don't click on. Defence rests....

  13. GWT probably helps Joost, otherwise people would not use it.
    But a year ago everyone thought the same about xml sitemaps.

    I'm not using it and will not unless it could be helpful for the site I'm working on.
    But that has to be a site i wouldn't want to work on :)

    Yes Google does good marketing.
    Only they have targets like advertisers and not webmasters when it comes to marketing.
    But we wouldn't agree on that so no need to discuss that further.

    Will read the article you wrote about it in a minute!

  14. @Myron Rosmarin: your story is exactly the reason why I hate google sitemaps… They will get your pages indexed, but NOT ranking… For a page to be able to rank, it HAS to be linked from and discoverable through the web somewhere…

    That is simply not a true statement. The content in those pages does very well for queries in the tail and is responsible for a considerable amount of organic traffic from search engines. To say otherwise is misrepresenting the value of sitemaps.

  15. I would assume that, as on my site, all htm pages on Myron Rosmarin's site are linked via a
    person-readable sitemap.html, so they would all be found & indexed by search engines anyway. (If not, this is a good idea for everybody to do, unless somebody knows better)...

  16. Of course, and if you have a lot of pages then you split up your sitemaps and bring some internal linking structure into the sitemaps.

    The time you save by not using an XML sitemap you can use to make sitemaps that are really helpful for your rankings.
    Which isn't the case with xml sitemaps.

  17. @Myron Rosmarin: nope, it just means that those pages would have been indexed without Google Sitemaps as well.

    We have millions of pages that are discoverable via crawling or via the XML sitemaps. There is a huge improvement in the quality of the URL strings in the sitemaps. I can see from these URL strings in the Google search results exactly how the crawler found the page. Sitemaps are winning that contest by a lot. :-)

  18. @Myron: are you saying the pages on your site are discoverable under multiple URL’s?

    Indeed. But that's common for search driven data. I don't get too concerned about this since the length of the crawl path to these pages makes it less likely that crawlers will find all of it (or even most of it). Once again, I stand behind the sitemap which makes discovery of the content infinitely easier.

  19. Joost,

    Looks like the supplemental tag has been definitely removed. Good call!

  20. On the other hand (let's bring this topic up again), how bad is it to have pages in the supplemental results?

    One of my sites has almost 40% of its pages in the supplemental index (a lot, percentage-wise), but that's not bad for the average ranking of my site... the pages which I want to rank are still ranked... only the supplemental results aren't... but I don't need them to rank....

  21. This is so strange. Google and Yahoo themselves are promoting XML sitemaps, and then they rank you lower if you provide an XML sitemap. I have both an XML sitemap and a PHP sitemap. When I submitted the PHP sitemap, Google gave errors. So should I now believe this article, or write to Google to ask them to start accepting PHP sitemaps?

  22. Hi Joost,

    Thank you for this information. I hope you won't mind answering two short questions: After removing
    the XML Sitemaps plug-in from wordpress, and removing the two "sitemap" files that were written to the root directory (where our blog is located) - are there any other files to remove and do we need to "notify" google, etc. of the removal?

    Thanks again!
    -Todd

