Blocking your site’s search results?

Recently, Matt Cutts posted about search results in search results: apparently Vanessa had updated the guidelines a bit after some uncertainty about how you should handle your site's search results. The guideline now states:

Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.

Of course, you could start by arguing that your search result pages do add a lot of value, and decline to block them from the search engines on that account. What I find more intriguing, though, is that Google is telling me to block them. Loads of people have been talking about this, and about how Google apparently isn't able to filter these results out algorithmically, otherwise it wouldn't have asked you to block them. Now, all that is fine with me: if Google doesn't want me to let it index my search results, I won't. I do, however, have a problem with how they're asking me to block those results, and with the example Matt gave in his post:

As a result of that question, YouTube added a ‘Disallow: /results?’ line in its robots.txt file. That’s good because as Google recrawls web pages, we’ll see that and begin to drop those search results.

First of all, if you choose to block them through robots.txt, I’d advise you to do so only for Google. After all, we haven’t heard from Yahoo!, MSN and Ask on this, and why block pages that might get you good traffic from these search engines…
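For instance, a robots.txt along these lines keeps only Googlebot out of an internal search path (the /search prefix is an assumption; use whatever prefix your own search result URLs share):

    # Block only Googlebot from search result pages.
    User-agent: Googlebot
    Disallow: /search

    # All other crawlers may crawl everything.
    User-agent: *
    Disallow:

Crawlers obey the most specific User-agent group that matches them, so Googlebot follows the first block while Yahoo!, MSN and Ask fall through to the second.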

Beyond that, I propose not blocking those pages, but doing a conditional redirect for Google. Think about it: if people are linking to your search results, those results were apparently adding value for them. Google, oddly enough, doesn't want to index the page that was adding value for that visitor, but the solution it offers throws away a lot of the link equity that apparently satisfied reader gave you.

So, in my opinion, the best solution is to 301 redirect Google to the first result of the search that was linked to, provided you trust your site's search engine enough. If that's too slow or too hard to code, you could just 301 redirect to your homepage.
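Here's a minimal sketch of the simpler variant as an Apache mod_rewrite rule, assuming your search results live under a /search path; the path is illustrative, and everything else should be adapted to your own URL scheme:

    # .htaccess sketch: 301 Googlebot away from internal search result pages.
    # /search is an assumed URL prefix; adjust it to your own setup.
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
    # The trailing "?" drops the original query string from the redirect target.
    RewriteRule ^search /? [R=301,L]

Redirecting to the first result of the linked search, as suggested above, would require a lookup in your own application rather than a static rewrite rule.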


6 Responses

  1. By Malte on 13 May, 2007

     Are you sure that links to pages blocked via robots.txt won't add link strength to your domain? I think you benefit from links to your search results even if there is no redirect.

  2. By Joost de Valk on 13 May, 2007

     They probably would add to your domain's link strength, but they wouldn't count as links toward a specific page, while deep links can be incredibly good for your rankings. My method preserves the complete equity of such links.

  3. By Louie on 16 May, 2007

     Google has a good point there.
     Usually the search page has no results whatsoever if the query is empty, and some websites will either display every product in the database, show a sitemap, or simply give you a plain-text message like "you need to enter a search criterion", or, as on your site: "Can't find what you're looking for, yet you suspect it must be there? Search for it!". So it's a total waste of time for Google to end up on that page for no reason.

     I know it's not possible for a search engine to submit a form, but there are websites that use non-standard form buttons with an onclick JavaScript event plus an href attribute in case JavaScript is disabled, or perhaps a plain-text link to the search page.

     One solution I use myself, assuming you can really make use of the search page, is logging the search queries to the database, which most of us running large websites do already for various purposes.
     Besides the search query (which has to be unique, BTW, otherwise I increment a count on the record found), I also record the total number of results returned, then run a database query (sorted by lastdate and resultcount DESC) and create an XML sitemap for search engine submission; a code sketch of this pipeline appears below the comments.
     In time your sitemap can get very large, and new records are always added to the table, so it could turn out to be a useful page in the end.

  4. By Apache Admin on 22 May, 2007

     I actually love telling Google what to block in robots.txt files because it helps me see the overall "big picture" of my site the way Googlebot sees it.

  5. By Henri on 26 October, 2007

     I 'save' search results in HTML files (not real HTML, but it looks like an HTML page). These files are nice for Google, because they act as sitemaps. Very handy for a large tree structure.

  6. By SirCommy on 18 December, 2007

     I will definitely not do that, not even for Google. But I'm a special case: I have SEO-friendly URL rewriting turned on, and Google only encounters nice short URLs (i.e. domain.com/results/news/page1.html), so
     it's impossible to end up on nonexistent pages unless I intentionally delete them. But of course, there are the search results (i.e. results.php?=034324932-4924320-4324 etc.) which should not be included; I can agree to that. I'm tired of stumbling over 'Ooops, page/object not found'.
     Cheers
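As a footnote to Louie's comment above, here is a minimal sketch of that logging-and-sitemap pipeline in Python, assuming a SQLite database; the searchlog table, its column names, and the example search URL are all illustrative:

    import sqlite3
    from datetime import date
    from urllib.parse import quote_plus

    # Illustrative schema: one row per unique search query, with a hit count,
    # the number of results it returned, and the date it was last searched for.
    conn = sqlite3.connect("searchlog.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS searchlog (
            query       TEXT PRIMARY KEY,
            hits        INTEGER NOT NULL DEFAULT 1,
            resultcount INTEGER NOT NULL,
            lastdate    TEXT NOT NULL
        )
    """)

    def log_search(query, resultcount):
        """Insert a new query, or bump the hit count if it already exists."""
        conn.execute("""
            INSERT INTO searchlog (query, hits, resultcount, lastdate)
            VALUES (?, 1, ?, ?)
            ON CONFLICT(query) DO UPDATE SET
                hits = hits + 1,
                resultcount = excluded.resultcount,
                lastdate = excluded.lastdate
        """, (query, resultcount, date.today().isoformat()))
        conn.commit()

    def build_sitemap(base_url="http://example.com/search?q="):
        """Build an XML sitemap of search pages, newest and richest first."""
        rows = conn.execute("""
            SELECT query FROM searchlog
            WHERE resultcount > 0
            ORDER BY lastdate DESC, resultcount DESC
        """)
        urls = "\n".join(
            "  <url><loc>%s%s</loc></url>" % (base_url, quote_plus(q))
            for (q,) in rows
        )
        return ('<?xml version="1.0" encoding="UTF-8"?>\n'
                '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
                + urls + "\n</urlset>")

Calling log_search() on every search keeps the table current, and build_sitemap() can be run periodically to regenerate the file you submit to the search engines.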