Analysing statistics just isn’t that easy

December 09th, 2013 – 29 Comments

Having a solid statistical and scientific background, I often find myself frustrated by the way research and data analysis are done in User Experience Metrics, Conversion Optimization and Google Analytics. In my opinion, doing research and analysing statistics requires proper training and an understanding of what you are doing. Am I the only one?

You need a brain to do statistics!

Just last week, I resisted several urges to throw ‘Measuring the user experience’ by Tom Tullis and Bill Albert against the wall. After reading the whole book, I found it to be a brave attempt to explain statistics, but also a total oversimplification of doing research. In my view, such simplification seriously undermines the reliability of results.

The common message in the online research community appears to be that research and statistics are easy and can be done by everyone. True, all kinds of packages like Google Analytics or Convert make doing statistics that much easier. But still… you really need a brain to do it!

3 pitfalls

Research methodology and simple descriptive statistics are not easy. In my first year of university, three quarters of the students failed their first (mainly descriptive) statistics exam. My years of teaching also made clear that mathematics and statistics are the most challenging subjects. It is hard! Doing research and analysing data without proper knowledge of both research designs and statistics can lead to serious misinterpretations of results. I will discuss 3 pitfalls:

1. Doing statistics with small amounts of data

I am not going to argue that statistical analysis with fewer than 30 observations is not possible, because there are tests (Student's t-test, for example) specifically designed for doing just that. Still, one should be aware that small samples have limited power. This means that a difference between two small samples will only be significant if it is obvious and large. For instance, if the old design of your checkout page had an average conversion rate of 2 % and the new design has a conversion rate of 20 %, then the difference may well appear significant with only a few dozen observations. But usually differences aren’t that obvious. Small differences or nuances cannot be tested with small samples.
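To make the point about limited power concrete, here is a rough sketch of how many observations each design would need. It uses the standard normal-approximation formula for comparing two proportions; the 5 % significance level, the 80 % power and the 2.5 % comparison rate are my own assumptions, not figures from this post, and exact small-sample tests will give somewhat different numbers.

```python
from math import ceil
from statistics import NormalDist


def observations_per_group(p_old, p_new, alpha=0.05, power=0.80):
    """Approximate sample size per design for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_old * (1 - p_old) + p_new * (1 - p_new)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_new - p_old) ** 2)


large_gap = observations_per_group(0.02, 0.20)   # 2 % versus 20 %, as in the example
small_gap = observations_per_group(0.02, 0.025)  # 2 % versus 2.5 %, a hypothetical nuance
print(large_gap, small_gap)
```

Under these assumptions the large difference needs a few dozen observations per design, while a difference of half a percentage point needs well over ten thousand.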

More importantly, I really wonder whether you should do statistical analyses with a very small sample at all. I would always advise a qualitative approach if you have a sample of 15 individuals or fewer. In a qualitative research design you gather an in-depth understanding of human behaviour. Ask open questions and try to discover why visitors of your website buy your products, (dis)like your design or read your posts. Analysing these answers (in a non-statistical manner) will be of great value in increasing the conversion of your site. ‘Measuring the user experience’ actually gives a nice introduction to a more qualitative approach to user experience research.

2. Representative sample

Just as important as the sample size is the question of whether the sample is representative. Does the sample of individuals you study resemble the total population? An example:

If we did a User Experience study of our website and asked totally random people to visit it, the sample would not be representative. No offence, but to visit the Yoast website, you have to be some kind of nerd. You can imagine that the User Experience of random people will probably differ greatly from that of nerds. A representative sample of our population would thus be a random sample of nerds. We would need nerds from all over the world, because our readers from the US probably differ from the ones we have in Europe, India or Australia. And maybe, because of a recent growth in our reader population, our current population also includes some non-nerds. We should definitely take into account the nerdiness of the individuals in our sample. Drawing a representative sample is hard, all the more so if you do not know exactly what your population looks like. Taking a large random sample takes care of most of these issues. But: especially with small samples, it is hard to make sure your sample is representative. And: a non-representative sample leads to non-representative (and thus worthless) results.
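One simple way to check a sample against what you know about your population is to compare shares per group. The sketch below does this for regions; the population shares and the 20-person sample are entirely made up for illustration, and `share_gaps` is a hypothetical helper, not part of any tool.

```python
from collections import Counter


def share_gaps(sample_regions, population_shares):
    """Return, per region, the sample share minus the population share."""
    counts = Counter(sample_regions)
    n = len(sample_regions)
    return {
        region: counts[region] / n - share
        for region, share in population_shares.items()
    }


# Hypothetical numbers: assumed reader shares and a small convenience sample.
population_shares = {"US": 0.40, "Europe": 0.35, "India": 0.15, "Australia": 0.10}
sample_regions = ["US"] * 14 + ["Europe"] * 5 + ["India"] * 1

gaps = share_gaps(sample_regions, population_shares)
for region, gap in gaps.items():
    print(f"{region}: {gap:+.0%}")  # positive means over-represented in the sample
```

A large positive or negative gap for a group is a hint that the results say more about that sample than about your readers as a whole. The same comparison could be made for any trait you know the population distribution of.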

Validity & Reliability


The validity of a measurement tool (for example a question in a survey) tells us the degree to which the tool actually measures what it claims to measure. Sometimes it is referred to as accuracy.


Reliability is the extent to which a measurement gives consistent results. So, if you pose the same question to the same person twice, will the answers be the same? A reliable measurement tool results in the same answers over and over again.

Difference between reliability and validity:

Imagine a person of 200 pounds stepping on a scale 5 times and getting readings of 15, 250, 95, 140 and 500 pounds. This scale is not reliable: the reading is different every time. If the scale consistently reads 150 pounds, the scale is reliable, because the readings are the same. However, the scale is not valid. The reading is wrong. It does not measure what you want to measure.
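The scale example can be put into numbers. In this small sketch I treat the spread of the readings as a stand-in for reliability and the average error (bias) as a stand-in for validity; that mapping is a simplification of mine, and `describe` is a hypothetical helper.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 200  # pounds, from the example above


def describe(readings, true_value=TRUE_WEIGHT):
    """Return (bias, spread): the average error and the standard deviation."""
    return mean(readings) - true_value, pstdev(readings)


erratic = [15, 250, 95, 140, 500]  # the scale that reads differently every time
consistent = [150] * 5             # the scale that always reads 150 pounds

erratic_bias, erratic_spread = describe(erratic)
consistent_bias, consistent_spread = describe(consistent)

print(f"erratic scale:    bias {erratic_bias:+.0f}, spread {erratic_spread:.0f}")
print(f"consistent scale: bias {consistent_bias:+.0f}, spread {consistent_spread:.0f}")
```

The erratic readings happen to average out to exactly 200 pounds, yet any single reading is useless because the spread is huge. The consistent scale has no spread at all but is always 50 pounds off.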

3. Validity: GIGO

Website analytics is awesome because a lot of measuring is very easy. You can just count the number of visitors on your page and the number of clicks on a button. Attitudes towards your brand and self-reported issues with usability are much more difficult to measure though. If you want to measure these kinds of things, you could do a qualitative study with a small sample. But a quantitative design with a larger number of individuals is also possible. Possible, but also challenging and difficult! Drafting the questions of a survey (especially with limited answer options) is difficult and requires proper testing. You should make sure that your questions really measure what you want to know. Measuring what you want to measure is what we call the validity of your measurements. An example:

You want to measure the extent to which people like the design of your website. You ask whether they like the colour. The answers to this question indeed say something about the degree to which people like your website. But design is more than colour. You would probably need more questions to really capture the degree to which people like the design of your website.

If the questions you present to people are of bad quality, the data will be of bad quality as well. So remember GIGO: Garbage In, Garbage Out!

Interpreting invalid data (however sophisticated the statistical analyses you apply) will always lead to invalid results.


Research is definitely a very powerful tool. But, I think you should have some statistical and methodological background in order to interpret results and execute proper analyses. Taking the time to really understand what you are doing is required.

In this post, I have only discussed very basic methodological and statistical topics. If this is out of your league, you should definitely brush up on your statistical knowledge (only if you want to do research, otherwise please do something more fun).

That being said, I do understand the appeal of simple statistical techniques that are available to a broad public. Testing is a beautiful tool to improve your website! For the future, I expect research to become more and more important for website owners.

This is why we are currently brainstorming at Yoast about designing a tool or service that will help people interpret test results and statistics. We will keep you posted about developments in this new project!

29 Responses to Analysing statistics just isn’t that easy

  1. Andrew
    By Andrew on 9 December, 2013

    I find it even more difficult with Google hiding the keyword data.

    • Marieke van de Rakt
      By Marieke van de Rakt on 10 December, 2013

      I agree, that makes a lot of things more difficult!

      • David
        By David on 28 December, 2013

        Google Analytics is useless now in my opinion because they hide so much info. There are other free programs out there that will do the same thing though; just search Google.

      • David
        By David on 28 December, 2013

        Google Analytics is basically useless nowadays; they withhold so much information. But there are plenty of other programs out there that can do the same thing. Just do a Google search; a lot of them are free.

        • Joseph
          By Joseph on 2 January, 2014

          I totally agree. Google Analytics has turned into a visitor counter in my opinion. Even with their integration of Webmaster Tools it just shows apparently what sites are showing up for in search engines but all I am seeing is image results.

  2. Marc Queralt
    By Marc Queralt on 10 December, 2013

    When you are doing any kind of task you need to know the tools and methodologies in order to avoid pitfalls.

    Maths, statistics and analysis require brains and also time, even if Google Analytics seems to make things simple.

    Really nice post. I agree with you in all points.

  3. Mike Yeats
    By Mike Yeats on 11 December, 2013

    Good points here! If YOAST can come up with something to really shine a light on this, then it will be really welcomed.

  4. Susan
    By Susan on 11 December, 2013

    Even with my own background in statistical analysis, the information currently available for website owners is hit or miss at best. It will be nice if Yoast could help with this issue. I’ll keep my eye on your site for further developments.

    BTW: Thanks for referring to me as one of the “nerds.” :D

  5. Shane Jennings
    By Shane Jennings on 11 December, 2013

    I have to recommend “How to Measure Anything: Finding the Value of “Intangibles” in Business”. He covers a lot of territory and anyone interested in using statistical data may find it insightful.
    On a personal note, I find it’s difficult to even know what to measure when looking for relationships. Often there is some simple, non-obvious metric that isn’t tracked but has a clear correlation with conversion or deepening user interaction. The path to discovery for these metrics usually involves physically watching someone who is unfamiliar with an interface and who isn’t conscious of being observed (like at a tradeshow or however you can “socially engineer” a situation where this is possible).
    I guess the point being, if you work with smaller samples, maybe you need more data about each sample and the findings can become more usable?

    • Marieke van de Rakt
      By Marieke van de Rakt on 19 December, 2013

      I think you are right, with smaller samples, you can go more in depth… measure more about fewer people. Your methods to analyze such data will be different than when analyzing larger samples. I am going to take a look at the book you recommended! Thanks

  6. Heikki Hyppänen
    By Heikki Hyppänen on 12 December, 2013

    Good article! This is really basic stuff, and you don’t need a background in statistics to get this far. Although I’m sure it’ll help. Think logically, study the topic and the system, and approach the data carefully, and you’re likely to get something useful out of it.

  7. Darshan Beloshe
    By Darshan Beloshe on 13 December, 2013

    Analysis for my projects was really cool until Google started reporting ‘not provided’. Though, I found another way to track by landing pages.

  8. Samuel Albert
    By Samuel Albert on 15 December, 2013

    I don’t necessarily think that you need some statistical and methodological background to understand google analytics. I think over time you will be able to determine what constitutes a good statistical sample for your blog. This is especially true if you are getting consistent traffic.

  9. Midas
    By Midas on 16 December, 2013

    I think point 1 “Doing statistics with small amounts of data” is really important. We find a lot of people make this mistake and follow patterns that are too fresh and immediate.

    A small data set just lends itself to skew and sending you off in the wrong direction, which of course is damaging!

  10. Chris ODell
    By Chris ODell on 17 December, 2013

    Reading this article reminds me of the book ‘Software Estimation: Demystifying the Black Art’ by Steve McConnell. In it the author points out that if you want to do estimation properly you need to take a lot of measurements from previous projects, do the appropriate math and crunch a lot of numbers. He also points out that for an awful lot of development projects this is probably overkill. Steve McConnell then goes on to say that good estimation results can still be achieved by using heuristics rather than a robust scientific method. Perhaps this is the approach we, as website owners, need to take? We know that we are not approaching the analysis in a scientific and robust way, but if we use the data we have and apply the small amount of brain power we have, then decent trends and results can still be spotted. So maybe we shouldn’t call it statistical analysis, but maybe the ‘black art’ of statistical estimation? (mmm.. a better name is perhaps needed.)

  11. Marcel
    By Marcel on 18 December, 2013

    A good article and I definitely agree with your conclusion on research becoming more and more important.

  12. Neha
    By Neha on 19 December, 2013

    awesum …your points are great..i like your conclusion..research is really important…!!!

  13. Angelina
    By Angelina on 24 December, 2013

    i like your post…thanks for writing…!!

  14. sharma
    By sharma on 28 December, 2013

    your points are great..i like your conclusion..awesome…!!

  15. joseph
    By joseph on 28 December, 2013

    Being in the field of statistics, I just can say that once you know how to analyse them, your life would be much easier

  16. blog mobile
    By blog mobile on 29 December, 2013

    Good article! This is really basic stuff, and you don’t need a background in statistics to get this far. Although I’m sure it’ll help. Think logically, study the topic and the system, and approach the data carefully, and you’re likely to get something useful out of it.

  17. Rafał
    By Rafał on 29 December, 2013

    I worked for a short period of time as an analyst; people usually don’t understand how much work it is.

  18. David
    By David on 30 December, 2013

    To this day I primarily use GA. I still don’t understand all the data but the possibilities are endless and I think I get better at using and understanding GA every day.

  19. sharma
    By sharma on 30 December, 2013

    i like your conclusion..awesome…!!

    By on 4 January, 2014

    I agree that hiding data by Google really complicates the analyses. They give us a tool and take away the major information from it. I personally think that ‘not provided’ is the biggest failure in 2013.

  21. Career Vendor
    By Career Vendor on 6 January, 2014

    Analyzing data from website traffic, analytics software, AdSense and other advertising campaigns is not easy and is not a task for everyone. Analyzing data in the right way can lead you to success in a short time.

  22. sharm
    By sharm on 6 January, 2014

    This is really basic stuff, and you don’t need a background in statistics to get this far. Although I’m sure it’ll help. Think logically, study the topic and the system, and approach the data carefully, and you’re likely to get something useful out of it.

  23. Nathan
    By Nathan on 6 January, 2014

    I agree. I have to use more tools to analyze data. It is not easy for me.
