Finding conditional and joint probabilities from a simulation

Question

Consider the Markov chain with state space S = {1, 2} and transition matrix

and initial distribution α = (1/2, 1/2).

Suppose, the source code for simulation is the following:

alpha <- c(1, 1) / 2
mat <- matrix(c(1 / 2, 0, 1 / 2, 1), nrow = 2, ncol = 2) 

chainSim <- function(alpha, mat, n) 
{
  out <- numeric(n)
  out[1] <- sample(1:2, 1, prob = alpha)
  for(i in 2:n)
    out[i] <- sample(1:2, 1, prob = mat[out[i - 1], ])
  out
}

Suppose the following is the result of a 5-step Markov Chain simulation repeated 10 times:

> sim
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2    1    1    2    2    2    1    1    1     2
[2,]    2    1    2    2    2    2    2    1    1     2
[3,]    2    1    2    2    2    2    2    1    2     2
[4,]    2    2    2    2    2    2    2    1    2     2
[5,]    2    2    2    2    2    2    2    2    2     2
[6,]    2    2    2    2    2    2    2    2    2     2

What would be the values of the following?

P(X₁ = 1, X₃ = 1)
P(X₅ = 2 | X₀ = 1, X₂ = 1)
E(X₂)

I tried them as follows:

mean(sim[4, ] == 1 && sim[2, ]== 1)
?
c(1,2) * mean(sim[2, ])

What would be (2)? Am I correct with the rest?

Kindly explain your response.

score 1 · Accepted Answer · answered May 10 '19 at 15:19

1

You are almost correct about 1: there is a difference whether you use && or &, see

?`&&`

It should be

mean(sim[1 + 1, ] == 1 & sim[1 + 3, ] == 1)

Then 2 is given by

mean(sim[1 + 5, sim[1 + 0, ] == 1 & sim[1 + 2, ] == 1] == 2)

where you may get NaN in the case if the conditional event {X₀ = 1, X₂ = 1} doesn't appear in your simulation.

Lastly, point 3 is

mean(sim[1 + 2, ])

since a natural estimator of the expected value is simply the sample average.

answered May 10 '19 at 15:19

Julius Vainora

47,421
9
90
102

@user366312, `mean(sim[2, ])` gives just a number, while `c(1,2)` is a vector. Even if `mean(sim[2, ])` were a vector, you would need to sum the result. Also, not always there are going to be both ones and twos among your results. A correct version of your solution would be `sum(as.numeric(names(table(sim[2, ]))) * table(sim[2, ]) / 10)`. Lastly, you need row 3 rather than 2. – Julius Vainora May 11 '19 at 16:30

score 0 · Answer 2 · answered May 10 '19 at 15:19

Take advantage of the problem structure, state 2 is an absorbing state. The only way for X₁=1 and X₃=1 is if it begins with 1 and in every intermediate step, it keep visiting state 1. Hence, the answer is (0.5)⁴=0.0625.

In terms of simulation, rather than

mean(sim[4, ] == 1 && sim[2, ]== 1

It should be

mean(sim[4, ] == 1 & sim[2, ]== 1

&& only check the first component.

For the second part, one possible way is to note that

P(X₅=2|X₀=1, X₂=1)=P(X₅=2,X₀=1, X₂=1)/P(X₀=1, X₂=1)

of which you can then first estimate the numerator and the denominator separately and then compute the ratio.

Alternatively, P(X₅=2|X₀=1, X₂=1)=P(X₅=2| X₂=1)=P(X₃=2| X₀=1)

For the third question, E(X₂) is a single number, it is not a vector. It can be estimated by mean(sim[3,])

Finding conditional and joint probabilities from a simulation

2 Answers2