
What is a better way to calculate a confidence interval (CI) for a proportion when the sample size is small, even as small as 1?

I am currently calculating the CI for a one-sample proportion with: [image: CI formula]

However, my sample size is very small; sometimes it is even 1. I also tried an approximate (1−α)100% confidence interval for a proportion p of a small population, using: [image: CI formula with finite population correction]

Specifically, I'm trying to implement those two formulas to calculate the CI for a proportion. As you can see in the graph below, at 2018-Q1 the blue group has no CI around it because 1 out of 1 people chose that item in 2018-Q1. The Finite Population Correction (FPC) doesn't correct the CI when N is 1. So my question is: what would be the best statistical way to handle this small-sample-size issue when the observed proportion is 100%?

[image: graph of the proportion per group by quarter, with confidence intervals]
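A minimal sketch of why the interval disappears, assuming the first formula is the usual normal-approximation (Wald) interval, p̂ ± z·sqrt(p̂(1 − p̂)/n). The function name `wald_interval` is hypothetical and only serves the illustration:

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, alpha=0.05):
    """Normal-approximation (Wald) interval for a proportion.

    Assumed form: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).
    """
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# 1 success out of 1 trial: p_hat = 1, so p_hat * (1 - p_hat) = 0
low, high = wald_interval(1, 1)
print(low, high)
```

Whenever the observed proportion is exactly 0 or 1, the estimated standard error is zero, so both limits coincide with p̂ regardless of n. Multiplying by a finite population correction factor cannot widen an interval whose half-width is already zero.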

  • It would be great if you could suggest a Python package to calculate it. Thanks!
Sharedobe
  • Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and __what has been done so far to solve it__. – scopchanov Aug 11 '18 at 01:37

1 Answer


Try statsmodels.stats.proportion.proportion_confint

http://www.statsmodels.org/devel/generated/statsmodels.stats.proportion.proportion_confint.html

According to their documentation, you use it like this:

from statsmodels.stats.proportion import proportion_confint
ci_low, ci_upp = proportion_confint(count, nobs, alpha=0.05, method='normal')

Where the parameters are:

  • count (int or array_like) – number of successes, can be pandas Series or DataFrame
  • nobs (int) – total number of trials
  • alpha (float in (0, 1)) – significance level, default 0.05
  • method (string, default 'normal') – method to use for the confidence interval; currently available methods:

    • normal : asymptotic normal approximation
    • agresti_coull : Agresti-Coull interval
    • beta : Clopper-Pearson interval based on Beta distribution
    • wilson : Wilson Score interval
    • jeffreys : Jeffreys Bayesian Interval
    • binom_test : experimental, inversion of binom_test
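For the 1-out-of-1 case in the question, a hand-rolled sketch of two of the listed interval types shows that they stay non-degenerate. This is not the statsmodels implementation: `wilson_interval` and `exact_interval_all_successes` are hypothetical helper names, written with the standard library only, and the second one covers just the special case where every trial is a success.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, alpha=0.05):
    """Wilson score interval, without continuity correction."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clip to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def exact_interval_all_successes(n, alpha=0.05):
    """Clopper-Pearson interval for the special case successes == n.

    The lower limit solves p**n = alpha / 2; the upper limit is 1.
    """
    return (alpha / 2) ** (1 / n), 1.0


wilson_low, wilson_high = wilson_interval(1, 1)
exact_low, exact_high = exact_interval_all_successes(1)
print(f"Wilson: ({wilson_low:.3f}, {wilson_high:.3f})")
print(f"Exact : ({exact_low:.3f}, {exact_high:.3f})")
```

With one success in one trial, the Wilson lower limit is 1 / (1 + z²), roughly 0.21 at alpha=0.05, and the exact lower limit is alpha / 2 = 0.025. Both intervals are very wide, which is an honest reflection of how little a single observation says. Calling `proportion_confint(1, 1, method='wilson')` or `method='beta'` should give comparable limits.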
Kelvin Wang
  • 2
    Thank you for suggesting the Python package for calculating the CI. Brown, LD, Cai, TT and DasGupta, A (2001). Interval Estimation for a Binomial Proportion. Statistical Science 16:101-133 recommends the Wilson or Jeffreys methods for small n, and Agresti-Coull, Wilson, or Jeffreys for larger n. – Sharedobe Aug 16 '18 at 16:10
  • Thank you for this code. However, I am uncertain which method to go for: `normal` or `binom_test`. I have aggregated data from an A/B test, for instance total sessions and sessions with orders. I can get a ratio from these two. Any thoughts? – Death Metal May 06 '22 at 03:44
  • 1
    @DeathMetal Assuming your data is quantitative, you could perhaps subtract the two and see if the difference is significantly different from zero, or just compare two normal distributions. Binomial would be appropriate when you have a series of True/False outcomes (Bernoulli trials) and you wish to find a confidence interval for the success rate. Normal is appropriate if you have a distribution of numbers. – Kelvin Wang May 06 '22 at 17:25
  • @KelvinWang thank you for your valuable comment. Appreciate it. :) – Death Metal May 10 '22 at 15:56