3

The expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n. Suppose I estimate P using an unbiased method sufficiently many times. What type of distribution does the estimated P follow? It is clear that P is not normally distributed, since it cannot be negative. May I therefore assume that P is gamma distributed? If so, what are the parameters of this distribution? A histogram of the probabilities of selecting an element from a 100-element set over 1000 trials is shown here.

Is there any way to convert this to a standard distribution?

Now suppose that the observed probability of selecting the given element was P* (P* != P). How can I estimate whether the bias is statistically significant?

EDIT: This is not homework. I'm doing a hobby project and I need this piece of statistics for it. I did my last homework ~10 years ago :-)

skaffman
Boris Gorelik
  • 1
    This is not a homework. I'm doing a hobby project and I need this piece of statistics for it. I've done my last homework ~10 years ago:-) – Boris Gorelik Oct 21 '08 at 21:45
  • Doesn't this all depend on your random number generator? If a random number generator was *perfect*, the probability is always 1/n for every pick regardless of number of picks and after 1000 picks, each element should have been picked 1000/n times - I seem to miss something here. – Mecki Oct 21 '08 at 21:47

4 Answers

3

The number of times a given element is selected clearly follows a binomial distribution with p=1/(number of elements) and n=(number of trials).

To test whether the observed result differs significantly from the expected result, you can do the binomial test.

The dice examples on the two Wikipedia pages should give you some good guidance on how to formulate your problem. In your 100-element, 1000 trial example, that would be like rolling a 100-sided die 1000 times.
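As a rough illustration of the binomial test for the 100-element, 1000-trial example, here is a minimal sketch using only the Python standard library. The function names `binom_pmf` and `binom_test_two_sided` and the observed counts 10 and 25 are made up for the example, and the two-sided rule used here (sum the probabilities of all outcomes no more likely than the observed one) is one common convention, not the only one.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space (needs 0 < p < 1)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def binom_test_two_sided(k_obs, n, p):
    """Exact two-sided p-value: total probability of every outcome
    that is no more likely than the observed count k_obs."""
    p_obs = binom_pmf(k_obs, n, p)
    # Small relative tolerance so floating-point ties count as "equally likely".
    threshold = p_obs * (1 + 1e-7)
    total = sum(pk for pk in (binom_pmf(k, n, p) for k in range(n + 1))
                if pk <= threshold)
    return min(1.0, total)

n_trials = 1000          # number of selections
p_expected = 1 / 100     # one element out of 100

# Hypothetical observed counts for one fixed element.
p_value_typical = binom_test_two_sided(10, n_trials, p_expected)
p_value_extreme = binom_test_two_sided(25, n_trials, p_expected)

print("observed 10 hits: p-value =", p_value_typical)
print("observed 25 hits: p-value =", p_value_extreme)
```

A small p-value (say, below 0.05) suggests the observed frequency P* is unlikely under P = 1/n. If SciPy is available, `scipy.stats.binomtest` should give a comparable result without the hand-rolled sum.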

Randy
3

With repetitions (sampling with replacement), your distribution will be binomial. Let X be the number of times you select some fixed object out of M total selections from N objects:

P{ X = x } = ( M choose x ) * (1/N)^x * ((N-1)/N)^(M-x)

You may find this difficult to compute for large N. It turns out that for sufficiently large N, this actually converges to a normal distribution with probability 1 (Central Limit theorem).

In that case, P{X=x} can be approximated by a normal distribution with mean M/N and variance M * (1/N) * (N-1)/N.
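To get a feel for how close the normal approximation is, the following sketch compares the exact binomial probabilities with the normal density for the question's numbers (N = 100 objects, M = 1000 selections). The helper names `exact_pmf` and `normal_density` are just for illustration.

```python
from math import comb, exp, pi, sqrt

N = 100    # number of objects in the set
M = 1000   # total number of selections

mean = M / N                          # expected number of hits
variance = M * (1 / N) * (N - 1) / N  # binomial variance M*p*(1-p)

def exact_pmf(x):
    """Exact binomial probability of selecting the fixed object x times."""
    return comb(M, x) * (1 / N) ** x * ((N - 1) / N) ** (M - x)

def normal_density(x):
    """Density of the normal distribution with the same mean and variance."""
    return exp(-((x - mean) ** 2) / (2 * variance)) / sqrt(2 * pi * variance)

for x in (0, 5, 10, 15, 20):
    print(x, exact_pmf(x), normal_density(x))
```

With p = 1/N this small, the agreement is decent near the mean but visibly worse in the tails, because the binomial is skewed to the right while the normal is symmetric.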

Ying Xiao
  • 5
  • "this actually converges to a normal distribution with probability 1": Nope, the convergence is in distribution (after suitable rescaling). The difference between both is not relevant here, but mathematically your statement is very wrong. – Alexandre C. Apr 23 '12 at 11:37
  • The convergence is also much slower for p close to 0 or 1, such that N has to be extremely large. – Andrew Mao Jan 25 '13 at 07:58
1

As others have noted, you want the Binomial distribution. Your question seems to imply an interest in a continuous approximation to it, though. It can actually be approximated by the normal distribution, and also by the Poisson distribution.
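For a small p and many trials, the Poisson approximation uses the single parameter lambda = n*p. A quick sketch with the question's numbers (n = 1000 trials, p = 1/100), where the names `binomial_pmf`, `poisson_pmf` and `max_gap` are illustrative only:

```python
from math import comb, exp, factorial

n = 1000      # number of trials
p = 1 / 100   # probability of picking the fixed element
lam = n * p   # Poisson rate, equal to the binomial mean

def binomial_pmf(k):
    """Exact probability of k hits in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    """Poisson probability of k hits with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)

# Largest pointwise difference over the range where the mass is concentrated.
max_gap = max(abs(binomial_pmf(k) - poisson_pmf(k)) for k in range(41))

for k in (0, 5, 10, 15, 20):
    print(k, binomial_pmf(k), poisson_pmf(k))
print("max gap for k in 0..40:", max_gap)
```

Unlike the normal approximation, the Poisson distribution is never negative and keeps the right skew, which makes it a natural fit for rare events like this one.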

Alex Coventry
0

Is your distribution a discrete uniform distribution?

David Harkness
rice