I built a contingency table from the Titanic passenger data using hypergeometric sampling, which means that both sets of marginal totals (row and column) are fixed in advance and equal. The table crosses the Sex and Survived columns for 328 cases (164 men and 164 women). This is the code:
First, I ungrouped the data and dropped the columns I don't need:
titanic = as.data.frame(Titanic)
titanic = titanic[rep(1:nrow(titanic),titanic$Freq),]
titanic = titanic[,c(2,4)]
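To illustrate what the `rep()` indexing above does, here is a minimal sketch on a made-up data frame (`toy` and its values are hypothetical, not taken from the Titanic data). Each row is repeated as many times as its `Freq` value, which gives one row per case:

```r
# Hypothetical toy frequency table (not the Titanic data)
toy <- data.frame(Sex = c("Male", "Female"), Freq = c(3, 2))

# Repeat each row index Freq times: 1 1 1 2 2
idx <- rep(seq_len(nrow(toy)), toy$Freq)

# One row per case; the Freq column is no longer needed
toy_long <- toy[idx, "Sex", drop = FALSE]
nrow(toy_long)  # 5
```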
Then I selected a random sample of 164 men:
men = subset(titanic, titanic$Sex == 'Male')
men = men [sample(nrow(men),164), ]
table(men$Sex, men$Survived)
#          No Yes
#   Male   133  31
#   Female   0   0
Now the row for women must be filled in with the appropriate values, that is, the men's counts swapped, so that the column totals also come to 164:
n = summary.factor(men$Survived)  # n[1] = men's "No" count, n[2] = men's "Yes" count
womenYes = subset(titanic, (titanic$Sex == 'Female' & titanic$Survived=='Yes'))
womenYes = womenYes[1:n[1], ]  # as many surviving women as non-surviving men
womenNo = subset(titanic, (titanic$Sex == 'Female' & titanic$Survived=='No'))
womenNo = womenNo[1:n[2], ]  # as many non-surviving women as surviving men
women = merge(womenYes, womenNo, all = TRUE)
hyperSample = merge(men, women, all = TRUE)
table(hyperSample$Sex, hyperSample$Survived)
#          No Yes
#   Male   133  31
#   Female  31 133
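As a quick sanity check, both margins can be confirmed to equal 164. This is only a minimal sketch: it re-enters the counts from the table above by hand (the matrix `tab` is just for illustration), whereas with the real data the result of `table(hyperSample$Sex, hyperSample$Survived)` would be used instead:

```r
# Counts copied by hand from the table above
tab <- matrix(c(133, 31,
                31, 133),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("Male", "Female"),
                              Survived = c("No", "Yes")))

addmargins(tab)  # appends a "Sum" row and column
rowSums(tab)     # Male 164, Female 164
colSums(tab)     # No 164, Yes 164
```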
It works, but it looks a bit ugly, and I suspect someone could find a much more elegant or efficient way to do it. Thanks.