Probability of selecting an element from a set

Question

The expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n. Suppose I measure P using an unbiased method sufficiently many times. What is the distribution type of P? It is clear that P is not normally distributed, since it cannot be negative. Thus, may I correctly assume that P is gamma distributed? And if yes, what are the parameters of this distribution? A histogram of the probabilities of selecting an element from a 100-element set 1000 times is shown here.

Is there any way to convert this to a standard distribution?

Now suppose that the observed probability of selecting the given element was P* (P* != P). How can I estimate whether the bias is statistically significant?

EDIT: This is not homework. I'm doing a hobby project and I need this piece of statistics for it. I did my last homework ~10 years ago :-)

This is not a homework. I'm doing a hobby project and I need this piece of statistics for it. I've done my last homework ~10 years ago:-) — Boris Gorelik, Oct 21 '08 at 21:45
Doesn't this all depend on your random number generator? If a random number generator was *perfect*, the probability is always 1/n for every pick regardless of number of picks and after 1000 picks, each element should have been picked 1000/n times - I seem to miss something here. — Mecki, Oct 21 '08 at 21:47

score 3 · Answer 1 · answered Oct 21 '08 at 21:58

This is clearly a binomial distribution with p = 1/(number of elements) and n = (number of trials).

To test whether the observed result differs significantly from the expected result, you can do the binomial test.

The dice examples on the two Wikipedia pages should give you some good guidance on how to formulate your problem. In your 100-element, 1000 trial example, that would be like rolling a 100-sided die 1000 times.
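The binomial test described above can be sketched in plain Python. This is only an illustration, not a reference implementation: the function names are made up for this example, the observed count of 18 is a hypothetical value, and the two-sided p-value uses one common convention (summing the probabilities of all outcomes that are no more likely than the observed one).

```python
from math import exp, lgamma, log

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials (requires 0 < p < 1).
    Computed in log space so large n does not overflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def binom_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of every outcome
    that is no more likely than the observed count k."""
    # Small relative tolerance so ties are not lost to rounding error.
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)
    probs = (binom_pmf(i, n, p) for i in range(n + 1))
    return min(1.0, sum(q for q in probs if q <= threshold))

# Hypothetical observation: one element of a 100-element set was picked
# 18 times in 1000 draws, where 1000/100 = 10 picks would be expected.
p_value = binom_test_two_sided(k=18, n=1000, p=1 / 100)
print(f"two-sided p-value = {p_value:.4f}")
```

A small p-value (for example below 0.05) would suggest that the element is being picked more or less often than 1/n predicts.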

score 3 · Accepted Answer · answered Oct 21 '08 at 22:30

With repetitions, your distribution will be binomial. So let X be the number of times you select some fixed object out of N objects, with M total selections:

P{ X = x } = ( M choose x ) * (1/N)^x * ((N-1)/N)^(M-x)

You may find this difficult to compute for large M. It turns out that for sufficiently large M, this actually converges to a normal distribution with probability 1 (central limit theorem).

In that case, P{X=x} will be approximately given by a normal distribution. The mean will be M/N and the variance will be M * (1/N) * (N-1)/N.
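As a rough sketch of how the mean and variance above could be used, the snippet below turns an observed count into a z-score and a two-sided p-value under the normal approximation. The function name and the observed count of 18 are assumptions made for illustration, and no continuity correction is applied. With p = 1/N as small as 0.01 the approximation is coarse, so treat the result as indicative only.

```python
from math import erfc, sqrt

def normal_approx_p_value(x: int, M: int, N: int) -> float:
    """Two-sided p-value for seeing one fixed element x times in M draws
    from an N-element set, using the normal approximation."""
    mean = M / N
    variance = M * (1 / N) * ((N - 1) / N)
    z = (x - mean) / sqrt(variance)
    # P(|Z| >= |z|) for a standard normal Z.
    return erfc(abs(z) / sqrt(2))

M, N = 1000, 100  # 1000 selections from a 100-element set
sd = sqrt(M * (1 / N) * ((N - 1) / N))
print(f"mean = {M / N}, standard deviation = {sd:.3f}")
print(f"p-value for x = 18: {normal_approx_p_value(18, M, N):.4f}")
```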

answered Oct 21 '08 at 22:30

Ying Xiao

"this actually converges to a normal distribution with probability 1": Nope, the convergence is in distribution (after suitable rescaling). The difference between both is not relevant here, but mathematically your statement is very wrong. – Alexandre C. Apr 23 '12 at 11:37
The convergence is also much slower for p close to 0 or 1, such that N has to be extremely large. – Andrew Mao Jan 25 '13 at 07:58

score 1 · Answer 3 · answered Oct 24 '08 at 13:50

As others have noted, you want the binomial distribution. Your question seems to imply an interest in a continuous approximation to it, though. It can actually be approximated by the normal distribution (when the number of trials is large), and also by the Poisson distribution (when p is small and the number of trials is large).
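To get a feel for the Poisson approximation mentioned above, here is a small comparison for the 100-element, 1000-trial case from the question, where the Poisson rate is n * p = 10. The helper names are illustrative only, and the chosen values of k are arbitrary.

```python
from math import exp, lgamma, log

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability of k successes in n trials (0 < p < 1)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events when the expected count is lam."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))

n, p = 1000, 1 / 100
lam = n * p  # expected number of times a given element is picked

for k in (5, 10, 15, 20):
    print(f"k={k:2d}  binomial={binom_pmf(k, n, p):.5f}  "
          f"poisson={poisson_pmf(k, lam):.5f}")
```

The two columns should agree closely here because p is small and n is large, which is the regime where the Poisson approximation is usually applied.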

answered Oct 24 '08 at 13:50

Alex Coventry

score 0 · Answer 4 · edited Nov 07 '12 at 23:51

Is your distribution a discrete uniform distribution?

edited Nov 07 '12 at 23:51

David Harkness

answered Oct 21 '08 at 21:25

rice
