Random number generation from multinomial distribution in R using rmultinom() function

Question

I would like to generate a sample of size 20 from the multinomial distribution with three values such as 1,2 and 3. For example, the sample can be like this sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)

The following code runs, but it does not give the expected result:

> rmultinom(20,3,c(0.4,0.3,0.3))+1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,]    1    1    3    2    2    1    1    2    3     2     3     2     1     2     2     3     1     2     2     2
[2,]    2    1    2    1    3    2    4    2    1     2     2     1     1     2     1     2     3     2     3     3
[3,]    3    4    1    3    1    3    1    2    2     2     1     3     4     2     3     1     2     2     1     1

I was not expecting this matrix. Any help is appreciated.

I just named the sample /outcome as `sam`. Say, `sam=rmultinom(20,3,c(0.4,0.3,0.3))+1` — Uddin, Dec 03 '19 at 15:42
the expected output should be like this `1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1`. The sample should contain `1,2 and 3`. — Uddin, Dec 03 '19 at 15:43
Are you sure you want a multinomial distribution? You seem to be describing just a random discrete distribution. It sounds like you want `sample(1:3,20,prob=c(0.4,0.3,0.3), replace=TRUE)` — MrFlick, Dec 03 '19 at 15:52

score 6 · Answer 1 · answered Jul 04 '20 at 16:31

I would like to generate a sample of size 20 from the multinomial distribution

No problem, but you should remember that each sample is a vector, e.g. if you roll three dice you can get (2,5,1), or (6,2,4), or (3,3,3) etc.
You should also remember that in rmultinom(n, size, prob) "n" is the sample size (the number of random vectors drawn), and "size" is the total number of objects that are put into K boxes (when you roll three dice, the size is 3 and K=6).

with three values such as 1,2 and 3.

No problem, but you should remember that rmultinom will return the count of each value, i.e. you could think of your three values as row names (they could just as well be "red, green, blue", "left, middle, right", etc.)

> rmultinom(n=20, size=3, prob=c(0.4,0.3,0.3))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,]    2    1    1    2    1    1    3    1    1     3     2     1     0     0     0     1     3     2     2     1
[2,]    1    1    1    1    0    1    0    1    2     0     1     2     2     2     1     1     0     0     1     0
[3,]    0    1    1    0    2    1    0    1    0     0     0     0     1     1     2     1     0     1     0     2

In the first sample (first column) "1" occurs 2 times, "2" occurs 1 time, "3" occurs 0 times. In the second and third samples each value occurs 1 time,... in the seventh sample "1" occurs 3 times etc.
Since you are putting three (size=3) objects into K=3 boxes (there are as many boxes as the length of the prob vector), the sum of each column is the number of your objects.
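The column-sum property described above can be checked directly. This is only a small sketch: the seed and the variable name `x` are arbitrary choices, not part of the original question.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

# 20 draws, each placing size = 3 objects into 3 boxes
x <- rmultinom(n = 20, size = 3, prob = c(0.4, 0.3, 0.3))

dim(x)      # rows = number of boxes, columns = number of draws
colSums(x)  # each column (one draw) adds up to size
rowSums(x)  # total count per box over all 20 draws
```

Each column is one multinomial observation, so it is the columns, not the rows, that sum to `size`.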

For example, the sample can be like this sam=(1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1)

This does not look like a sample of size 20, because the outcome of a single multinomial trial is a vector, not a number.

Let's return to dice. I roll size=3 dice:

> rmultinom(n=1, size=3, prob=rep(1/6,6))
     [,1]  
[1,]    0
[2,]    2
[3,]    0
[4,]    0
[5,]    1
[6,]    0

I get two "2"s and one "5". This is a sample of size 1. Here is a sample of size 10:

> rmultinom(n=10, size=3, prob=rep(1/6,6))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    1    0    0    0    0    1    1     1
[2,]    1    1    0    3    0    0    1    1    1     1
[3,]    2    1    0    0    0    0    2    0    0     0
[4,]    0    0    2    0    1    1    0    1    0     1
[5,]    0    0    0    0    1    2    0    0    1     0
[6,]    0    1    0    0    1    0    0    0    0     0
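If a plain vector of labels such as 1, 2, 3 is what is wanted, one possible route that still goes through rmultinom is to draw with size=1, so that every column contains a single 1, and then read off the row in which it sits. This is only a sketch; the names `onehot` and `sam` and the seed are assumptions made for the illustration.

```r
set.seed(42)  # arbitrary seed for reproducibility

# size = 1: each column is a one-hot vector marking the chosen box
onehot <- rmultinom(n = 20, size = 1, prob = c(0.4, 0.3, 0.3))

# the row index of the single 1 in each column is the sampled label
sam <- apply(onehot, 2, function(col) which(col == 1))

sam          # a vector of 20 labels taken from 1, 2, 3
length(sam)
```

With size=1 each draw is a single categorical outcome, which is why the result can be flattened to one number per draw.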

HTH

Gregor Thomas · Answer 2 · 2019-12-03T16:15:40.537

0

Your code does 20 draws of size 3 (each) from a multinomial distribution. This means that you will get a matrix with 20 columns (n = 20) and 3 rows (length of your prob argument = 3), where the sum of each column is also 3 (size = 3). The classic interpretation of a multinomial is that you have size balls to put into K boxes, each box with a given probability; the result shows how many balls end up in each box. Your code adds 1 to everything, so it's as if each box already has 1 ball in it, and the sum of each column will actually be 6.

Your comments and your description of the result you want don't sound like you care about "balls and boxes". It sounds like you want to draw 20 numbers, with replacement, from the set {1, 2, 3}. If this is the case, use sample:

sample(1:3, size = 20, replace = TRUE, prob = c(0.4,0.3,0.3))
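As a rough illustration of how the two views relate, the sample() call can be tabulated: the three counts always add up to 20, which is the shape of a single multinomial draw with size = 20. The names `sam` and `counts` and the seed below are chosen only for this sketch.

```r
set.seed(123)  # arbitrary seed for reproducibility

# 20 labels drawn with replacement from {1, 2, 3}
sam <- sample(1:3, size = 20, replace = TRUE, prob = c(0.4, 0.3, 0.3))
sam

# tabulate; factor() keeps a label in the table even if it never occurred
counts <- table(factor(sam, levels = 1:3))
counts
sum(counts)  # always 20, the number of draws
```

So `sam` is the sequence of individual outcomes, while `counts` is the kind of summary that rmultinom returns.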

edited Dec 03 '19 at 16:15

answered Dec 03 '19 at 15:53

Gregor Thomas


Well, it is either multinomial (as OP asked) or not. Multinomial has quite a specific property, like sum of samples equal to asked number. – Severin Pappadeux Dec 03 '19 at 15:56

@SeverinPappadeux Based on OP's comments, I think OP is using the term "multinomial" without understanding its definition. Hence my distinction between what OP's code does (multinomial) and my interpretation of OP's description (sample 20 numbers with replacement from a set of 3). – Gregor Thomas Dec 03 '19 at 15:59

@SeverinPappadeux , e.g., when OP says *"the expected output should be like this `1,2,2,2,2,3,1,1,1,3,3,3,2,1,2,3,...1`"*, it doesn't seem at all like they are thinking about the sum of samples. – Gregor Thomas Dec 03 '19 at 16:02
Well, I would be glad to hear any updates on OP question, but it could be done using multinomial distribution as I demonstrated – Severin Pappadeux Dec 03 '19 at 16:18

Also note OP's request *"I am not expecting this matrix"*. You may want to wrap your result in `c()` to get closer to what OP wants. – Gregor Thomas Dec 03 '19 at 16:30

Severin Pappadeux · Answer 3 · 2019-12-03T17:55:25.193

-1

How about

q <- rmultinom(20,2,c(0.4,0.3,0.3))+1

UPDATE

If one still wants to follow the multinomial PMF and have a higher frequency of larger values, there is another variant:

q <- 3 - rmultinom(20,2,c(0.4,0.3,0.3))
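For readers comparing the answers, here is a quick inspection of what these two expressions return. It is a sketch only: the names `p`, `q1` and `q2` and the seed are introduced here for illustration and are not part of the answer.

```r
set.seed(7)  # arbitrary seed for reproducibility
p <- c(0.4, 0.3, 0.3)

q1 <- rmultinom(20, 2, p) + 1  # first variant
q2 <- 3 - rmultinom(20, 2, p)  # second variant

dim(q1)              # still a matrix: 3 rows, 20 columns
all(q1 %in% 1:3)     # every entry lies in 1..3
unique(colSums(q1))  # each column sums to 2 + 3 = 5
unique(colSums(q2))  # each column sums to 9 - 2 = 7
length(c(q1))        # flattening with c() gives 60 values
```

The entries within a column are tied together by the fixed column sum, so the 60 flattened values are not independent draws from {1, 2, 3}.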

edited Dec 03 '19 at 17:55

answered Dec 03 '19 at 15:49

Severin Pappadeux


@MrFlick What exactly is `not right`? Maybe you should familiarize yourself with the binomial distribution? Take a look at the PMF as well as my update – Severin Pappadeux Dec 03 '19 at 17:53
@MrFlick Why are you talking about mixing this or that? I'm quite aware of what the multinomial distribution is all about, its PMF, properties etc. And no, they do not sum to 20; the sampled values sum to the second parameter, whose value is 2. This is what the original OP question was and still is asking about. – Severin Pappadeux Dec 03 '19 at 22:14
@MrFlick `sample of size 20 from the multinomial distribution with three values such as 1,2 and 3` was the question. So yes, the two propositions I posted would return a sample of size 20 with 3 values per sample, 60 values total, yes, using the multinomial distribution. Maybe you should read the question again? As far as I can see you're trying to twist the original OP question to your agenda, whatever it is. – Severin Pappadeux Dec 03 '19 at 22:17
