XML Sitemap PHP script

Recently, while working on my father-in-law’s site (in Dutch), I wanted to create an XML sitemap for the many publications on his site, which are downloadable PDFs. I regularly add PDFs to his site too, and since I’m a tad lazy I don’t want to keep updating the XML sitemap by hand. So I wrote a small XML Sitemap PHP script that looks for all the files of a certain type in a directory, grabs their last modified time, and throws them in an XML sitemap.
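
The idea described above (scan one directory, keep files of certain types, read their modification time, emit XML) can be sketched roughly as follows. This is a minimal illustration, not the released script: the function names `collect_sitemap_entries` and `render_sitemap` and the example URL are assumptions made up for this sketch.

```php
<?php
// Hypothetical helper (not from the released script): collect one entry per
// matching file in a single directory, without descending into subdirectories.
function collect_sitemap_entries(string $dir, string $baseUrl, array $filetypes): array
{
    $entries = [];
    $files = scandir($dir);
    if ($files === false) {
        return $entries; // directory could not be read
    }
    foreach ($files as $file) {
        $path = $dir . DIRECTORY_SEPARATOR . $file;
        if (!is_file($path)) {
            continue; // skips ".", ".." and subdirectories
        }
        $ext = strtolower(pathinfo($file, PATHINFO_EXTENSION));
        if (!in_array($ext, $filetypes, true)) {
            continue; // not one of the wanted file types
        }
        $entries[] = [
            'loc'     => rtrim($baseUrl, '/') . '/' . rawurlencode($file),
            'lastmod' => gmdate('Y-m-d', filemtime($path)), // date-only W3C format
        ];
    }
    return $entries;
}

// Hypothetical helper: turn the entries into a sitemap document.
function render_sitemap(array $entries): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($entries as $entry) {
        $xml .= "  <url>\n";
        $xml .= '    <loc>' . htmlspecialchars($entry['loc'], ENT_XML1 | ENT_QUOTES) . "</loc>\n";
        $xml .= '    <lastmod>' . $entry['lastmod'] . "</lastmod>\n";
        $xml .= "  </url>\n";
    }
    return $xml . "</urlset>\n";
}
```

Calling `render_sitemap(collect_sitemap_entries('/path/to/files', 'https://example.com/files/', ['pdf']))` would then produce the sitemap body; both the path and the URL are placeholders.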

Then, when working on another site yesterday, a Base64 encoding and Base64 decoding experiment, I needed an XML sitemap yet again. So I improved the XML Sitemap PHP script a bit further and decided it should be released.

Why make an XML Sitemap for Static Files?

Let’s first address the “why” of this script: in lots of cases you’ll have static files, whether they’re PDFs or static PHP or HTML files that make up a site. I want all of those in an XML sitemap for two reasons:

  • to tell Google that they’re there;
  • to be able to see in Google Webmaster Tools whether they’re all indexed.

This script assumes that all those files are in one directory. I know that’s a bit “lazy”, but if your site spans a lot of directories you probably should be using a CMS.

Configuring the XML Sitemap PHP script

Of course this script needs a bit of configuration before it’ll work well. It has the following constants and variables:

  • SITEMAP_DIR
    The directory to search for files in.
  • SITEMAP_DIR_URL
    The URL of the sitemap directory.
  • RECURSIVE
    Whether or not the script should scan subdirectories recursively.
  • $filetypes
    An array of all the file types that you wish to include in your XML sitemap.
  • $replace
    An array of all the files that should be replaced with other URLs. Useful to, for instance, replace ‘index.php’ with an empty string, so it’ll look like just example.com/.
  • $ignore
    An array of all the files to ignore in the XML sitemap, useful for your config.php, for instance.
  • $xsl
    A relative path to the XSL file included in the script from the SITEMAP_DIR_URL location.
  • $chfreq
    The change frequency for files. The sitemap protocol allows ‘always’, ‘hourly’, ‘daily’, ‘weekly’, ‘monthly’, ‘yearly’ or ‘never’.
  • $prio
    The priority, a value between 0 and 1. Since the script can’t differentiate between files, there’s no big harm in setting them all to 1.
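
As a rough illustration, a configuration using the names listed above might look like the sketch below. Every value is a placeholder rather than the script’s actual default, and the shape of `$replace` (file name mapped to replacement) as well as the helper `sitemap_url_for` are assumptions made for this example.

```php
<?php
// Placeholder values for illustration only; adjust to your own site.
define('SITEMAP_DIR', './');                       // directory to search for files
define('SITEMAP_DIR_URL', 'https://example.com/'); // URL of that directory
define('RECURSIVE', true);                         // also scan subdirectories

$filetypes = ['php', 'html', 'pdf'];  // file types to include
$replace   = ['index.php' => ''];     // assumed shape: file name => replacement
$ignore    = ['config.php'];          // files to leave out
$xsl       = 'xml-sitemap.xsl';       // assumed file name, relative to SITEMAP_DIR_URL
$chfreq    = 'daily';                 // change frequency
$prio      = 1;                       // priority between 0 and 1

// Hypothetical helper showing one way $ignore and $replace could be applied
// to a file name. Returns null when the file should be skipped.
function sitemap_url_for(string $file, array $replace, array $ignore): ?string
{
    if (in_array($file, $ignore, true)) {
        return null;
    }
    $name = array_key_exists($file, $replace) ? $replace[$file] : $file;
    return SITEMAP_DIR_URL . $name;
}
```

With these placeholder settings, `index.php` would be listed as the bare site URL and `config.php` would be skipped.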

Styling the output of our XML Sitemap PHP Script

Of course, we’ll want our XML Sitemap to look good, as well as work well. For that we use an XSL stylesheet which is included in the download. It makes the XML sitemap look like this:

XML Sitemap PHP Script output, styled with XSL
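
For readers curious how an XSL stylesheet gets attached to a sitemap: the standard mechanism is an `xml-stylesheet` processing instruction placed right after the XML declaration. The sketch below shows that mechanism under the assumption that the stylesheet path is resolved against the sitemap directory URL; the helper name `sitemap_prologue` is hypothetical and the released script may build its output differently.

```php
<?php
// Hypothetical helper: build the first two lines of a sitemap that points
// the browser at an XSL stylesheet for display.
function sitemap_prologue(string $dirUrl, string $xsl): string
{
    // Resolve the stylesheet path against the directory URL.
    $href = rtrim($dirUrl, '/') . '/' . ltrim($xsl, '/');

    return '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
        . '<?xml-stylesheet type="text/xsl" href="'
        . htmlspecialchars($href, ENT_XML1 | ENT_QUOTES)
        . '"?>' . "\n";
}
```

Search engines ignore the stylesheet reference; it only affects how the sitemap looks when a person opens it in a browser.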


Download XML Sitemap PHP Script

I’ve put the whole script on GitHub, so you can play with it, fork it, etc. Or you could just download the zip.

Update 2012-09-29: I’ve updated the script to work recursively and fixed a few minor issues.
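
To give an idea of what recursive scanning involves, here is a minimal sketch using PHP’s SPL directory iterators. It is one possible approach, not necessarily how the updated script does it, and the function name `list_files_recursive` is made up for this example.

```php
<?php
// Hypothetical helper: list all files below $dir (including subdirectories)
// whose extension is in $filetypes, as paths relative to $dir.
function list_files_recursive(string $dir, array $filetypes): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        if (!$fileInfo->isFile()) {
            continue;
        }
        if (!in_array(strtolower($fileInfo->getExtension()), $filetypes, true)) {
            continue;
        }
        // getSubPathname() is forwarded to the inner RecursiveDirectoryIterator
        // and returns the path relative to $dir; normalise separators for URLs.
        $files[] = str_replace(DIRECTORY_SEPARATOR, '/', $iterator->getSubPathname());
    }
    sort($files); // directory order is not guaranteed, so sort for stable output
    return $files;
}
```

Each relative path returned here could then be appended to the directory URL to form the sitemap location.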

24 Responses

  1. By Herman dailybits on 19 July, 2011

    Nice script, and indeed useful for static websites. Before, I created the sitemap manually and was indeed also too lazy to update it every time.

  2. By James Dowen on 19 July, 2011

    This looks like a great script. I might use it on future websites that I build. Thanks for sharing!

  3. By Craig Cacchioli on 20 July, 2011

    A useful script Joost.
    Do you have any plans to integrate this into the wonderful WordPress SEO plugin or does it already use this script?

  4. By bart on 20 July, 2011

    Would be nice if you released this as a plugin, Joost. Your SEO plugin is already top notch!

  5. By Jeremy on 21 July, 2011

    Looking good Joost! I was wondering, is it possible to use some sort of cronjob to update the sitemap every week or month?

  6. By neil on 21 July, 2011

    Excellent thanks JdV

  7. By Robert Visser on 21 July, 2011

    Many of the sites with which I work have subdomains. Could you either provide or recommend a way to configure the PHP script to work with subdomains? Thanks.

  8. By Dale Reardon on 22 July, 2011

    Hi,

    I am using your great WordPress SEO plugin for sitemaps. Is there any way of making that plugin add PDF files to the sitemap?

    Thanks,
    Dale.

  9. By Ed on 23 July, 2011

    Hey Yoast fans, very interesting topic. I’m not a programmer or website developer, but all this techie language sure is interesting. Anyway, since I’m a blogger and I use WordPress, I’m having issues and I’m unable to get help from the support team at MSI.Hosting. The problem is that when I try to enter my site I’m getting the message “502 Bad Gateway nginx/0.8.53”. Can any of you intelligent Yoast geeks help me? Also, where can I find my sitemap? I hear all about this and I don’t know where to look. So if any of you brainy geeks can help, I sure would appreciate it very much. Ed :)

    P.S.

    I’m a subscriber to your newsletter and read your blog posts, as I have a link attached to my blog, and I’ve got to say: very interesting topics…

  10. By Oposiciones on 24 July, 2011

    Your article was helpful for me, thanks Joost.

    Best regards from Spain.

  11. By César Couto on 26 July, 2011

    Very interesting. You should also add the option of working with subdirectories. I’m making the changes to the script for that; it can be useful for some people.

  12. By Mark Fisher on 29 July, 2011

    This looks like a really useful tool.
    I would imagine it to be a simple task now to create more stylesheets for tasks, for example, internal link generation, or a menu generator module.

  13. By Craig on 30 July, 2011

    Excellent script….I would also like to see this incorporated into the plugin.

  14. By Jack on 1 August, 2011

    Hi! First of all..your SEO plugin for WordPress is nothing short of uber awesome! I also have a handful of static pages and this script will do the trick nicely. One quick question (sorry for the newbie-ness but I’m really trying :)…which file do I point my Google Webmaster Tools to when reporting the site map? The xml-sitemap.php file?

    Lastly…if I set the frequency to ‘hourly’ does that mean the script checks my files in that directory that meet the allowed extensions (set in the array) every hour automatically? So that way my sitemap is always up to date?

    Thanks so much for all you do…we will be making a donation through the Yoast WordPress plugin very soon!

    Jack

  15. By Dave Cain on 3 August, 2011

    Hey Joost, great job with this – are you planning on doing HTML, URL list and RSS sitemaps with this?

  16. By Jay on 5 August, 2011

    Thanks for the script Joost! I agree with Dale – any way to add PDFs to your sitemap plugin?

  17. By Arnie on 7 August, 2011

    Hello,

    I am using your WordPress-SEO plugin, it’s really great but there is already such an option for sitemaps. Can I install this plugin if they do not interfere with each other?

    Thanks,
    Arnie.

  18. By Wilmer on 9 August, 2011

    Hi Joost.

    What you’re doing is very good, and you’re a very skilled person. I hope you develop this as a plugin to make it easier to use.
    Your work will be rewarded.
    I do not know much about HTML, CSS, PHP … only the basics.
    Thank you.

  19. By Koozai Mike on 10 August, 2011

    That’s an excellent time saver. No more manually updating HTML files until the end of time.

  20. By Srinivas on 13 August, 2011

    Thanks for sharing, Joost. I’ll use it in my future projects.

  21. By Stephan on 16 August, 2011

    Hi Joost:

    I understand the purpose of evergreen blog posts, but I don’t quite understand the infrequency with which you post them! Almost a month since the last post now?

    Jonesing for more…
    S

  22. By Rob AtlantaHomes on 18 August, 2011

    This is interesting. Aren’t there utilities on the web that will build XML sitemaps from static sites for you?

    I remember using them prior to moving to WordPress.

  23. By Michael Dorf on 18 August, 2011

    Nice little utility, Joost. I’m using Arne Brachhold’s Sitemaps generator plugin on learncomputer.com. Will try to poke around to see if I can wedge this script into it. Should be a fun little project! Thanks!

  24. By M-A on 18 August, 2011

    This is such synchronicity!

    My co-worker recently had to work on a very old website, and needed to redirect all of the 100+ page links to the new site’s equivalents through htaccess. He did not want to do this manually, which involved finding each page, inside folders, etc. He told me he was working on a script similar to yours. I will pass this along and maybe he will be virtually hugging you if it does help him!

    Merci!