I am trying to create a table with random entries from a central hypergeometric distribution where the column and row totals are fixed.

I can get the column sums to be fixed and equal, but not the row sums. I have read other answers, but none seem to explain specifically how to do this. My R knowledge is pretty basic, so I could do with some help or a pointer in the right direction.

To get the values from a central hypergeometric distribution I am using the BiasedUrn package.

For example:

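library(BiasedUrn)
K <- 5                # number of categories (table rows); not defined in the original snippet, inferred from the 5-row output below
N <- 50               # total number of balls in the urn
rand <- 10            # number of random draws (table columns)
n1 <- 25              # balls taken per draw, so every column sums to 25
odds0 <- rep(1, K)    # equal odds give the central hypergeometric distribution
m0 <- rep(N/K, K)     # initial number of balls of each category
i <- as.table(rMFNCHypergeo(nran=rand, n=n1, m=m0, odds=odds0))
addmargins(i)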
             A   B   C   D   E   F   G   H   I   J Sum
       A     5   3   5   7   5   5   6   6   5   5  52    
       B     8   7   4   5   5   6   3   4   5   4  51
       C     3   6   4   4   4   5   6   8   5   4  49
       D     4   4   6   3   6   4   5   3   3   5  43
       E     5   5   6   6   5   5   5   4   7   7  55
       Sum  25  25  25  25  25  25  25  25  25  25 250

I want to keep all the column sums equal to 25 and all the row sums equal to another number of my choosing, such as 50.

Ben Bolker

1 Answer

Are you looking for the r2dtable function from base R?

set.seed(101)
tt <- r2dtable(n=1,c=rep(25,6),r=rep(50,3))
addmargins(as.table(tt[[1]]))
##       A   B   C   D   E   F Sum
## A     7   9   7  11   9   7  50
## B    10   7  10   6   7  10  50
## C     8   9   8   8   9   8  50
## Sum  25  25  25  25  25  25 150
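
As a quick check that this holds beyond a single draw, here is a minimal sketch that generates many tables and verifies both margins for each one. It uses only functions that ship with R; the variable names (`row_totals`, `col_totals`, `tabs`, `margins_ok`, `cell_means`) are chosen for illustration.

```r
set.seed(101)

col_totals <- rep(25, 6)  # six columns, each summing to 25
row_totals <- rep(50, 3)  # three rows, each summing to 50

# Both margins must describe the same grand total, otherwise r2dtable() fails
stopifnot(sum(row_totals) == sum(col_totals))

# Draw 1000 random tables; the result is a list of integer matrices
tabs <- r2dtable(n = 1000, r = row_totals, c = col_totals)

# Check that every table reproduces the requested margins
margins_ok <- vapply(
  tabs,
  function(tab) all(rowSums(tab) == row_totals) && all(colSums(tab) == col_totals),
  logical(1)
)
all(margins_ok)

# Average cell value across draws; each cell's expectation is
# row total * column total / grand total = 50 * 25 / 150
cell_means <- Reduce(`+`, tabs) / length(tabs)
round(cell_means, 2)
```

Note that the row and column totals must add up to the same grand total. For the 10-column example in the question, that means 5 rows of 50 or some other combination summing to 250.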
Ben Bolker
  • Thank you, this is exactly what I need for the margins. However, is there a way to have more control over the values inside the table, such as how they are distributed? –  Feb 13 '15 at 10:49
  • Fixed margins are a pretty strong constraint. `r2dtable` certainly doesn't have any more flexibility, but if you can compute a likelihood for a table under the distribution you're interested in, you might be able to do rejection sampling reasonably efficiently. Can you give more context? – Ben Bolker Feb 13 '15 at 15:30
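
One way to read the rejection-sampling suggestion, as a rough sketch rather than a definitive method: propose tables with `r2dtable` and accept each with a probability given by a weight between 0 and 1. Accepted tables then follow a distribution proportional to the `r2dtable` probability times that weight. The weight function `table_weight`, its `lambda` parameter, and the helper `sample_weighted_table` are hypothetical names invented for this example; the weight would need to be replaced by the relative likelihood of whatever distribution is actually wanted.

```r
set.seed(101)

row_totals <- rep(50, 3)
col_totals <- rep(25, 6)

# Expected cell counts under the r2dtable() distribution
expected <- outer(row_totals, col_totals) / sum(row_totals)

# Hypothetical weight in (0, 1]: largest for tables close to the expected
# counts, shrinking as cells move away from them. Swap in the relative
# likelihood of your own target, scaled so that it never exceeds 1.
table_weight <- function(tab, lambda = 0.02) {
  exp(-lambda * sum((tab - expected)^2))
}

# Rejection sampler: propose from r2dtable(), accept with probability table_weight()
sample_weighted_table <- function(max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    proposal <- r2dtable(1, r = row_totals, c = col_totals)[[1]]
    if (runif(1) < table_weight(proposal)) return(proposal)
  }
  stop("no table accepted; the weight may be too restrictive")
}

tab <- sample_weighted_table()
addmargins(as.table(tab))
```

Every accepted table keeps the fixed margins because it was proposed by `r2dtable`. The cost is efficiency: the more the weight differs from 1 across typical proposals, the more draws are rejected.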