What significance test should you use for a percentage metric with more than two experiments?

For example,

Version | Clicks | Impressions
A       | 5      | 1,763
B       | 4      | 1,672
C       | 2      | 1,689

How sure can we be that version A really is superior to the other two?

ʞɔıu

1 Answer

In the past I have run a pairwise G-test between the top and bottom versions, then multiplied the resulting p-value by a fudge factor of n choose 2, to account for the fact that any of the n choose 2 possible pairs could have been the most extreme one. In theory this is overly conservative, but it has worked for me.

See http://elem.com/~btilly/effective-ab-testing/ for more.
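As a rough sketch of the approach described above, applied to the numbers in the question: the code below runs a G-test on the 2x2 table (clicks vs. non-clicks) for the versions with the highest and lowest click rates, then scales the p-value by n choose 2. Reading the "fudge factor" as a multiplier on the p-value is an assumption, as is treating non-clicks as impressions minus clicks. The function name `g_test_2x2` is made up for illustration, and the p-value uses the standard chi-square approximation with 1 degree of freedom, which may be shaky with click counts this small.

```python
import math

def g_test_2x2(clicks_a, imps_a, clicks_b, imps_b):
    """G-test (log-likelihood ratio) on a 2x2 clicks / non-clicks table.

    Returns (G statistic, p-value from chi-square with 1 degree of freedom).
    """
    observed = [
        [clicks_a, imps_a - clicks_a],
        [clicks_b, imps_b - clicks_b],
    ]
    total = imps_a + imps_b
    row_totals = [imps_a, imps_b]
    col_totals = [clicks_a + clicks_b, total - clicks_a - clicks_b]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_totals[i] * col_totals[j] / total
            if o > 0:  # a zero cell contributes nothing (0 * ln 0 -> 0)
                g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative rounding error

    # Survival function of chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Data from the question: version -> (clicks, impressions)
versions = {"A": (5, 1763), "B": (4, 1672), "C": (2, 1689)}

# Pick the versions with the highest and lowest click rates.
by_rate = sorted(versions, key=lambda v: versions[v][0] / versions[v][1])
bottom, top = by_rate[0], by_rate[-1]

g_stat, p_raw = g_test_2x2(*versions[top], *versions[bottom])

# Fudge factor: n choose 2 possible pairs could have been the extreme one.
n_pairs = math.comb(len(versions), 2)
p_adjusted = min(1.0, p_raw * n_pairs)

print(f"{top} vs {bottom}: G = {g_stat:.3f}, raw p = {p_raw:.3f}, "
      f"adjusted p = {p_adjusted:.3f}")
```

With so few clicks, the adjusted p-value stays well above the usual 0.05 threshold, so these numbers alone do not support calling A the winner.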

btilly