Some people believe that “bigger is better” when it comes to sample size – the more survey respondents you have, the more trustworthy your results.
True, a bigger sample gives you more precise estimates, which your results need in order to be trustworthy. It also gives you more statistical power to detect differences between an estimate and a benchmark, or between control and treatment groups.
But a bigger sample is necessary, not sufficient, for trustworthy results. You also need to correct for nonresponse error: the bias that arises when non-respondents differ systematically from the people who do respond to your survey.
My main recommendation – include only users impacted by the change in your analysis; exclude users who are not.
- Let’s say you have an e-commerce site. You want to test whether certain changes to your checkout page would increase conversion (% of users purchasing).
- You want to run a 2 x 2 Multi-Variable experiment with 1 control and 3 treatment groups.
- Your current conversion is 5%; you want to detect relative conversion changes as small as 10% (i.e., 5% to 5.5%), with the conventional 80% power and a 95% confidence level.
- According to this table in my blog post, you would need 30,400 users in each group, or 30,400 × 4 = 121,600 users in total visiting your site. (That’s a lot!)
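As a sanity check on the table, the standard two-proportion z-test formula can reproduce a figure in this ballpark. This is a sketch, not the table's exact method: different calculators use slightly different approximations (pooled vs. unpooled variance, arcsine transforms), so the result lands near, but not exactly on, 30,400 per group.

```python
import math
from statistics import NormalDist

def sample_size_per_group(baseline, relative_mde, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-proportion z-test (unpooled variance)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)          # 10% relative lift: 5% -> 5.5%
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

n_per_group = sample_size_per_group(0.05, 0.10)
n_total = 4 * n_per_group                       # 1 control + 3 treatment groups
print(n_per_group, n_total)
```

Note how quickly the requirement grows: the needed n scales with the inverse square of the absolute difference you want to detect, so halving the minimum detectable effect roughly quadruples the sample size.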
To calculate how many people you need in your experiment, you need to know 3 things:
1. How many groups are in your experiment?
- In an A/B experiment with a control and treatment group, you have 2 groups.
- In a 2 x 2 Multi-Variable experiment with 1 control and 3 treatment groups, you have 4 groups.
- The more groups you have, the more people you need.