Getting Google's last visit date from the Google API

When Vanessa Fox over at the Webmaster Central Blog said Google would update the "retrieved on" date of a cached page even when the page hadn't changed, that date suddenly became very cool info. It means you can see when a URL was last spidered, even if you don't have a Sitemaps account for the domain. This triggered me to find a way to get that date through the Google API and do nice stuff with it. It worked, and this is how I did it.

First, we have to get the cached page from the Google cache through the API:

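// Requires the NuSOAP library; the include path is an assumption, adjust it to your setup.
require_once 'nusoap.php';

// Fetches Google's cached copy of $url through the SOAP API; returns false on failure.
function GoogleCache($url) {
    $soapclient = new soapclient('http://api.google.com/search/beta2');
    $soapoptions = 'urn:GoogleSearch'; // namespace passed to call()
    $params = array(
        'key' => 'INSERT YOUR GOOGLE API KEY HERE',
        'url' => $url
    );
    $ret = $soapclient->call('doGetCachedPage', $params, $soapoptions);
    // getError() returns false when the call succeeded
    if ($soapclient->getError()) {
        return false;
    }
    return $ret;
}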

Now we have to make a function which uses this one, and gets the last visit date out of it:

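// Returns the "retrieved on" date string from the cache header, or false if it can't be found.
function GoogleLastSpidered($url) {
    $cache = GoogleCache($url);
    // Capture everything between "as retrieved on " and the next period
    if (!is_string($cache) || !preg_match('/as retrieved on (.*?)\./', $cache, $matches)) {
        return false;
    }
    return $matches[1];
}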
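To try the date extraction without calling the API, here is a minimal sketch that runs the same kind of pattern against a sample string. The sample header text and the helper name extractRetrievedOn() are assumptions made up for illustration; the real cache header may be worded or marked up differently.

```php
<?php
// Hypothetical sample of the header Google prepends to a cached page.
// The exact wording and markup are assumed, not copied from a real response.
$sampleCache = "This is Google's cache of http://www.example.com/ "
    . "as retrieved on 12 Jan 2007 08:15:42 GMT.<br>\nMore header text follows.";

// Illustrative helper: returns the text between "as retrieved on " and the
// next period, or null when the marker is not present.
function extractRetrievedOn($html) {
    if (preg_match('/as retrieved on (.*?)\./', $html, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

echo extractRetrievedOn($sampleCache), "\n";
```

The lazy (.*?) stops at the first period after the marker, so this only works as long as the date itself contains no period.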

If you want, you can now feed the output of this function to strtotime() to get a Unix timestamp you can compare or reformat, or you can just echo it.
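As an example of doing something with the date, this rough sketch turns a date string into "days since the last crawl". The function name daysSinceLastSpidered() and the optional $now parameter are my own additions for illustration, and the date format in the usage line is assumed.

```php
<?php
// Illustrative helper: number of whole days between the given date string
// and $now (a Unix timestamp, defaulting to the current time).
// Returns null when strtotime() cannot parse the string.
function daysSinceLastSpidered($dateString, $now = null) {
    $timestamp = strtotime($dateString);
    if ($timestamp === false) {
        return null;
    }
    if ($now === null) {
        $now = time();
    }
    // 86400 seconds in a day
    return (int) floor(($now - $timestamp) / 86400);
}

// Usage with an assumed date format; in practice you would pass in
// whatever date string you extracted from the cache header.
$days = daysSinceLastSpidered('12 Jan 2007 08:15:42 GMT');
echo "Last spidered $days days ago\n";
```

Passing $now explicitly makes the calculation easy to check against a fixed reference date.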

