How to induce correlations between two inverse cumulative probability distributions in R?

Question

I'd like to draw correlated samples from inverse cumulative distributions. For example, I currently have the two independent inverse-CDF samples shown below and would like to induce a correlation of, say, -0.5 between them. Is there a way to achieve this?


library(lognorm)
library(dplyr)

# Independent draws: runif() supplies the probabilities, qlnorm() is the inverse CDF
Var_a <- tbl_df(qlnorm(runif(1000), meanlog = 0.0326, sdlog = 0.0288))
var_b <- tbl_df(qlnorm(runif(1000), meanlog = 0.0452, sdlog = 0.0364))

cor(Var_a, var_b)

user63230 · Answer 1 · 2020-03-18T20:17:02.323

Would the following work for you?

set.seed(100)
x1 <- rnorm(1000)
y1 <- rnorm(1000) - .6 * x1

# Map the correlated normals to Uniform(0, 1) variables
x2 <- pnorm(x1)
y2 <- pnorm(y1)

cor(cbind(x2, y2))
#            x2         y2
# x2  1.0000000 -0.4995593
# y2 -0.4995593  1.0000000

Var_a <- tbl_df(qlnorm(x2, meanlog = 0.0326, sdlog = 0.0288))
var_b <- tbl_df(qlnorm(y2, meanlog = 0.0452, sdlog = 0.0364))

cor(Var_a, var_b)
#            value
# value -0.5239145
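
On the choice of -0.6: for independent standard normals x and e, y = e + b * x has cor(x, y) = b / sqrt(1 + b^2), so b = -0.6 gives about -0.51. As a minimal sketch (the names rho and b are illustrative, not part of the code above), the coefficient for an exact target can be obtained by inverting that formula:

```r
set.seed(100)
rho <- -0.5                      # target correlation on the normal scale
b <- rho / sqrt(1 - rho^2)       # inverse of rho = b / sqrt(1 + b^2); about -0.577

n <- 1e5                         # large sample so the estimate is stable
x1 <- rnorm(n)
y1 <- rnorm(n) + b * x1          # same construction as above, with b instead of -0.6

cor(x1, y1)                      # should be close to rho
```

The sample correlation still fluctuates around the target, so small samples such as n = 1000 will not hit -0.5 exactly.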

Update: I'm still not sure what you are trying to do, but if you just want to apply the approach above to 15 variables, something like this might work:

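library(MASS)
# Your correlation matrix: here 1 on the diagonal and 0.5 everywhere else
sigma <- matrix(.5, nrow = 15, ncol = 15) + diag(15) * .5
# Correlated standard normals (unit variances, so Sigma is also the correlation matrix)
vars <- mvrnorm(1000, mu = rep(0, 15), Sigma = sigma)
cor(vars)
# Map each column to a Uniform(0, 1) variable
vars2 <- pnorm(vars)
cor(vars2)
# Use each of these columns as the probability argument of qlnorm
vars2 <- data.frame(vars2)
names(vars2)

# paste0() avoids the space that paste() would put in the names ("log_ 1")
vars2[paste0("log_", 1:15)] <- lapply(vars2[, 1:15], function(x) {qlnorm(x, meanlog = 0.0326, sdlog = 0.0288)})
names(vars2)
# Keep only the log-normal columns
vars2 <- vars2[, -c(1:15)]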
cor(vars2)
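
One caveat, offered as a sketch and not as a required step: pnorm() slightly shrinks the Pearson correlation. For a bivariate normal pair with correlation rho_z, the resulting uniforms have correlation (6 / pi) * asin(rho_z / 2), so a target of -0.5 on the uniform scale needs rho_z = 2 * sin(pi * (-0.5) / 6), about -0.518. The names rho_u and rho_z below are illustrative:

```r
library(MASS)
set.seed(1)

rho_u <- -0.5                          # desired correlation of the uniforms
rho_z <- 2 * sin(pi * rho_u / 6)       # normal-scale correlation that produces it

sigma2 <- matrix(c(1, rho_z,
                   rho_z, 1), nrow = 2)
z <- mvrnorm(1e5, mu = c(0, 0), Sigma = sigma2)
u <- pnorm(z)                          # correlated Uniform(0, 1) pair

cor(z)[1, 2]                           # close to rho_z
cor(u)[1, 2]                           # close to rho_u
```

For moderate correlations the adjustment is small, which is why using the target matrix directly, as in the code above, usually lands close to the desired values.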

edited Mar 18 '20 at 20:17

answered Mar 18 '20 at 17:19


Thanks, I've a list of over 15 variables with a correlation matrix that I'd like to simulate and retain the correlations. – Dal Mar 18 '20 at 18:12
I guess this would be an example of NORTA (NORmal To Anything) method. Can you explain how you decided on -0.6 for the offset for `y1`? – user2474226 Mar 18 '20 at 18:13
So you have 15 variables and you want them all to be correlated with -0.5? I don't understand, could you explain? I chose -0.6 because I wanted a correlation of roughly -0.5. – user63230 Mar 18 '20 at 18:54
Sorry, I do have a correlation matrix for these 15 variables that I'd like to impose. I just used two variables and a correlation of -0.5 as an example. I need to run a log-normal simulation of these 15 variables preserving their correlations – Dal Mar 18 '20 at 19:53

user2474226 · Accepted Answer · 2020-03-18T20:47:03.197

If you have 15 variables with a correlation matrix CC, you could use a Gaussian copula to get correlated uniform variates, using the Cholesky decomposition of CC, then invert those with your specified marginals as you did above. (See here, for example).

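nv <- NROW(CC)                 # number of variables
num_samples <- 1000
# Independent standard normal draws, one column per variable
A <- matrix(rnorm(num_samples * nv), ncol = nv)
# chol(CC) is upper triangular with t(R) %*% R equal to CC, so A %*% R has
# correlation CC; pnorm() then maps each column to Uniform(0, 1)
U <- pnorm(A %*% chol(CC))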

If the log-scale means and standard deviations of your 15 variables (the meanlog and sdlog parameters) are stored in vectors means and stdevs, you could do:

rv <- sapply(1:nv, function(i) qlnorm(U[,i], meanlog = means[i], sdlog = stdevs[i]))

The columns of rv are your simulated variates, with a correlation structure close to the desired one, which you can check with cor(rv).
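
For reference, here is a self-contained sketch of the whole procedure on a small case. The 3 x 3 matrix CC and the values in means and stdevs are made up for illustration; substitute your own 15-variable inputs. CC must be symmetric and positive definite for chol() to succeed.

```r
set.seed(42)

nv <- 3
# Hypothetical target correlation matrix (illustrative values only)
CC <- matrix(c( 1.0, -0.5,  0.3,
               -0.5,  1.0,  0.2,
                0.3,  0.2,  1.0), nrow = nv, byrow = TRUE)
# Hypothetical log-scale parameters for each marginal
means  <- c(0.0326, 0.0452, 0.0400)
stdevs <- c(0.0288, 0.0364, 0.0300)

num_samples <- 10000
A <- matrix(rnorm(num_samples * nv), ncol = nv)   # independent normals
Z <- A %*% chol(CC)                               # correlated normals
U <- pnorm(Z)                                     # correlated uniforms (Gaussian copula)

# Invert each column with its own log-normal marginal
rv <- sapply(seq_len(nv), function(i) {
  qlnorm(U[, i], meanlog = means[i], sdlog = stdevs[i])
})

round(cor(rv), 2)                        # Pearson correlation, compare with CC
round(cor(rv, method = "spearman"), 2)   # rank correlation, unchanged by qlnorm()
```

With sdlog values this small the log-normal marginals are nearly symmetric, so cor(rv) should land close to CC. With more strongly skewed marginals the Pearson correlation can drift further from the target, while the rank correlation is unaffected by the qlnorm() step because that step is monotone.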
