For a simulation study, I want to generate a set of random variables (both continuous and binary) that have predefined associations to an already existing binary variable, denoted here as x.
For this post, assume that x is generated following the code below. But remember: in real life, x is an already existing variable.
set.seed(1245)
x <- rbinom(1000, 1, 0.6)
I want to generate both a binary variable and a continuous variable. I have figured out how to generate a continuous variable (see the code below):
set.seed(1245)
cor <- 0.8 #Correlation
y <- rnorm(1000, cor*x, sqrt(1-cor^2))
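As a sanity check on the block above, here is a rough sketch that compares the sample correlation with the target. Because x is a 0/1 variable with variance well below 1, multiplying it by the target correlation directly is not expected to reproduce that target (in theory it gives roughly 0.55 rather than 0.8 when P(x = 1) = 0.6). Standardizing x first should bring it closer. The standardization step and the names rho, x_std, y_raw, and y_std are assumptions of this sketch, not part of the code above.

```r
set.seed(1245)
x <- rbinom(1000, 1, 0.6)   # stand-in for the already existing binary variable
rho <- 0.8                  # target correlation (named rho so cor() is not masked)

# Approach from the post: shift the mean by rho * x, with x left on the 0/1 scale
y_raw <- rnorm(1000, rho * x, sqrt(1 - rho^2))

# Variant (assumption of this sketch): standardize x to mean 0, sd 1 first
x_std <- as.numeric(scale(x))
y_std <- rnorm(1000, rho * x_std, sqrt(1 - rho^2))

cor(x, y_raw)   # sample correlation using unstandardized x
cor(x, y_std)   # sample correlation using standardized x
```

Correlation is unaffected by shifting and rescaling, so the correlation of y_std with x_std is the same as its correlation with the original x.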
But I can't find a way to generate a binary variable that is correlated with the already existing variable x. I found several R packages, such as copula, which can generate random variables with a given dependency structure. However, they do not offer a way to generate variables with a set dependency on an already existing variable.
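One direction that might work for the binary case, as a minimal sketch rather than a definitive solution: since x takes only two values, a new binary variable z can be drawn from a Bernoulli distribution whose success probability depends on x. If p = P(x = 1), q = P(z = 1) and rho is the target correlation, the two conditional probabilities follow from the definition of the Pearson correlation. The values of rho and q and the names z, p, q, p0, and p1 are assumptions chosen for illustration.

```r
set.seed(1245)
x <- rbinom(1000, 1, 0.6)   # stand-in for the already existing binary variable

rho <- 0.5                  # target correlation (assumed value)
q   <- 0.5                  # target P(z = 1) (assumed value)
p   <- mean(x)              # observed P(x = 1)

# Conditional probabilities implied by p, q and rho
s  <- sqrt(q * (1 - q))
p1 <- q + rho * s * sqrt((1 - p) / p)   # P(z = 1 | x = 1)
p0 <- q - rho * s * sqrt(p / (1 - p))   # P(z = 1 | x = 0)

# Not every (p, q, rho) combination is attainable for two binary variables
stopifnot(p0 >= 0, p0 <= 1, p1 >= 0, p1 <= 1)

# Draw z with a success probability that depends on the value of x
z <- rbinom(length(x), 1, ifelse(x == 1, p1, p0))

cor(x, z)   # sample correlation; should be near rho
mean(z)     # sample proportion; should be near q
```

The feasibility check matters: for a given p and q, large correlations can push p0 or p1 outside [0, 1], in which case the target is not achievable and either rho or q has to change.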
Does anyone know how to do this in an efficient way?
Thanks!