Probability from bivariate normal distribution

Question

I am trying to find the probability that P(X̄+0.5 < Ybar) using a bivariate normal distribution. X has a mean of 9 and variance of 3, Y has a mean of 10, and a variance of 5, and their covariance is 2, with a trial of 50 independent measurements. What I have so far is:

library("mvtnorm") 
meanVector <- c(9,10)
coV <- matrix(c(3, 2, 2, 5), nrow=2)
biv <- rmvnorm(50, mean=meanVector, sigma=coV)
geneA <- mean(biv[,1])
geneB <- mean(biv[,2])

But I am not sure where to go next, or if I am even on the right track.

Xbar and Ybar are fixed. You must mean Pr(X + 0.5 < Ybar) – IRTFM Oct 03 '16 at 06:12 — IRTFM, Oct 03 '16 at 06:12

Weihuang Wong · Answer 1 · 2016-10-03T05:27:08.993

If you want to estimate this value by simulation, you'll have to do the experiment many times. Here we have an experiment with 50 trials, which we simulate 2000 times. Note that P(Xbar + 0.5 < Ybar) can be rewritten as P(Xbar + 0.5 - Ybar < 0):

set.seed(123)
sims <- 2000
out <- replicate(sims, {
  biv <- rmvnorm(50, mean=meanVector, sigma=coV)
  geneA <- mean(biv[,1])
  geneB <- mean(biv[,2])
  geneA + 0.5 - geneB
})

We can plot the distribution of this quantity:

plot(density(out), main = expression(bar(X) + 0.5 - bar(Y)))
dout <- density(out)
dout <- cbind(dout$x, dout$y)[dout$x < 0, ]
dout <- rbind(dout, c(0, 0))
polygon(dout[,1], dout[,2], col = "grey", lty = 0)
abline(v = 0, col = "red")

where we are interested in the shaded area. From our simulated values, we compute the proportion of simulations where Xbar + 0.5 - Ybar < 0:

mean(out < 0)
# [1] 0.9585

which is close to the analytic value of P(Xbar +0.5 < Ybar).

X and Y are a two dimensional pair. Ybar is fixed.I think the poster has confused things by putting a bar over the X when what he really wants is Pr(X+0.5 < Ybar). So graphically you would want to plot a set of contour ellipses with center at (9,10) and a vertical line at 9.5 — IRTFM, Oct 03 '16 at 06:11
I don't think Xbar and Ybar are fixed. The expected values of each are fixed; but Xbar and Ybar are themselves are sample means and have sampling distributions that depend on the size of N. — Weihuang Wong, Oct 03 '16 at 06:15

score 0 · Answer 2 · answered Oct 03 '16 at 06:52

Try this:

meanVector <- c(9,10)
coV <- matrix(c(3, 2, 2, 5), nrow=2)
n <- 10000 # choose a large number
prob <- sum(replicate(n, {
     biv <- rmvnorm(50, mean=meanVector, sigma=coV)
     geneA <- mean(biv[,1])
     geneB <- mean(biv[,2])
     geneA + 0.5 < geneB})) / n
prob
[1] 0.9592

Probability from bivariate normal distribution

2 Answers2