As long as you have mean vector and covariance matrix, simulating multivariate normal is very simple, via MASS:::mvrnorm
. Have a look at ?mvrnorm
for how to use this function.
If you do not have special requirement on the covariance matrix, i.e., a random covariance matrix will do. You need to first create a proper covariance matrix first.
A covariance matrix must be positive-definite. We can create a positive-definite matrix by taking crossproduct of a full-rank matrix. That is, if an n * p (n >= p)
matrix X
has full column rank, A = X' %*% X
is positive-definite (hence a proper covariance).
Let's first generate a random X
matrix:
p <- 100 ## we want p-dimensional multivariate normal
set.seed(0); X <- matrix(runif(p * p), p, p) ## this random matrix has full rank
Then get a covariance matrix:
COV <- crossprod(X) ## t(X) %*% X but about 2 times faster
We also need mean vector. Let's assume they are 0-mean:
mu <- rep(0, p)
Now we call MASS:::mvrnorm
for random sampling:
library(MASS) ## no need to install
x <- mvrnorm(1000, mu, COV) ## mvrnorm(sample.size, mean, covariance)
Now x
contains 1000 samples from 100-dimension (p-dimensional) multivariate normal distribution, with mean mu
and covariance COV
.
> str(x)
num [1:1000, 1:100] 1.66 -2.82 6.62 6.46 -3.35 ...
- attr(*, "dimnames")=List of 2
x
is a matrix, each row of which is a random sample. So in total we have 1000 rows.
For multivariate normal, marginal distribution is still normal. Hence, we can plot histograms for marginals. The following sketches the 1st, 10th, 20th and 30th marginal:
par(mfrow = c(2,2))
hist(x[, 1], main = "1st marginal")
hist(x[, 10], main = "10th marginal")
hist(x[, 20], main = "20th marginal")
hist(x[, 30], main = "30th marginal")
