0

I would like to model a sampling with replacement in R (like the urn model). That is, I have three different events (say: 1,2 and 3 (in fact they a categorical but I think this is not important at the moment)) and I know the probability of each event:

1 --> 0.5    
2 --> 0.2   
3 --> 0.3

Now I would like to take for example 50 samples with replacement and I would like to know the probabilities for each possible combination of the three different events.

My idea was to use rmultinom to generate these samples.

rmultinom(n=50,size=3,prob=c(0.5,0.2,0.3))

Now I get 50 randomly (?) Chosen samples, but I Need all possible combinations when I take 50 samples with replacement.

Jaap
  • 81,064
  • 34
  • 182
  • 193
KikiRiki
  • 47
  • 6
  • not `rmultinom(n=1, size=50,prob=c(0.5,0.2,0.3))`? – Khashaa Jul 24 '15 at 09:48
  • I think the main code I was looking for is `rmultinom(n=50,size=3,prob=c(0.5,0.2,0.3)`, though I still need to find out how to get any possible combination with the corresponding probability – KikiRiki Jul 24 '15 at 15:49

1 Answers1

0

If I'm understanding you correctly, the probabilities you're after can be calculated analytically.

My sense is that you want to treat all draws with the same numbers of 1s, 2s, and 3s as equivalent (see below if not). That is, 49 straight 1s followed by a 2 counts as the same 50-draw "outcome" as a 2 followed by 49 straight 1s.

In that case, what you're looking for is the value of the multinomial probability mass function evaluated for (p1 = 0.5, p2 = 0.2, p3 = 0.3) and at counts c1, c2, and c3, of 1s, 2s, and 3s (these should sum to 50). You can evaluate the multinomial PMF in R as:

counts = c(c1, c2, c3)
myProbs = c(0.5, 0.2, 0.3)
dmultinom(x = counts, prob = myProbs)

Now all that remains is to enumerate the all the possible combinations of 1s, 2s, and 3s that can occur in 50 draws. Calling the function nsimplex(3,50) (from the combinat package) tells us that there are 1326 of these, and calling the function xsimplex(3,50) (found in the same package) will generate them for us in matrix form. Here are the first five of the 1326 columns:

      [,1] [,2] [,3] [,4] [,5]
 [1,]   50   49   49   48   48
 [2,]    0    1    0    2    1
 [3,]    0    0    1    0    1

We then simply need to evaluate dmultinom for each column, using apply column-wise:

mySimplex = xsimplex(3, 50)
myProbs = c(0.5, 0.2, 0.3)
results = apply(mySimplex, 2, dmultinom, prob = myProbs)

The nth entry in the vector results will be the probability of the counts in the nth column of the matrix mySimplex.

Was this what you were after?

Distinct permutations: If you want to count distinct permutations differently, then the probability of any individual permutation is just:

 0.5^(c_1) * 0.2^(c_2) * 0.3^(c_3)

where c_1 is the number of 1s, c_2 the number of 2s, and c_3 the number of threes in the draw. But if you're looking to enumerate all of them, you might want to think again! The number of possible unique length-50 sequences in which each character is a 1, a 2, or a 3 is 3^50 > 10^23.

  • Thank you so much! That was exactly what I was looking for. When I use `Counts=c(c1,c2,c3)`R gives an error and says `Object c1 not found`. What is missing in my code ? Further, do you know how `nsimplex`determines the number of all possible combinations? I tried to reconstruct it manually but I could not find it. – KikiRiki Jul 26 '15 at 20:58
  • Sorry, I see how that might have been unclear. ``c1``, ``c2``, and ``c3`` in my example code were just placeholder variables for any particular counts of 1s, 2s, and 3s you might choose to evaluate. For example, to reproduce ``results[1]`` manually, you could do ``dmultinom(x = c(50, 0, 0), prob = c(0.5, 0.2, 0.3))`` – shiftedPriors Jul 26 '15 at 22:43
  • As for what ``nsimplex`` is doing under the hood, typing just the word ``nsimplex`` (without any parentheses or arguments) at the R command prompt yields the underlying R code (n.b. that this trick doesn't work for compiled R functions, but I have checked, and it does work for ``nsimplex``). – shiftedPriors Jul 26 '15 at 22:52