I implemented the following task successfully, but I only got it to work with a small test dataset. With my real dataset, the calculation never finishes.
I have a 400x400 transition probability matrix in R. A user hits "Conversion" if she converts on the Markov walk. The absorbing state is "Null" for all users. "Start" is my beginning state.
Two things I need to calculate:
1. Hit state s_j on a random walk beginning at "Start"
2. Hit "Conversion" on a random walk beginning at each of the 397 other states
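To make the setup concrete, here is a tiny made-up stand-in for the real matrix. The states A and B and all probabilities are invented for illustration. I am assuming the layout "Start" first, "Conversion" second to last, "Null" last, and that "Conversion" moves straight to "Null".

```r
# Hypothetical 5-state toy chain (names and numbers invented for illustration)
states <- c("Start", "A", "B", "Conversion", "Null")
toyMatrix <- matrix(c(
  0.0, 0.6, 0.3, 0.0, 0.1,   # Start
  0.0, 0.0, 0.5, 0.3, 0.2,   # A
  0.0, 0.2, 0.0, 0.4, 0.4,   # B
  0.0, 0.0, 0.0, 0.0, 1.0,   # Conversion -> Null (assumed)
  0.0, 0.0, 0.0, 0.0, 1.0    # Null is absorbing
), nrow = 5, byrow = TRUE, dimnames = list(states, states))

# Every row of a transition matrix must sum to 1
stopifnot(all(abs(rowSums(toyMatrix) - 1) < 1e-12))
```

The real transitionMatrix1 has the same shape, just with 400 states.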
First one is easy in R:
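library(expm)  # assumed source of %^% (matrix power); the package is not named in the post

# Start distribution: all mass on state 1 ("Start")
v <- numeric(length = ncol(transitionMatrix1))
v[1] <- 1
i <- 2
# Row k of R holds the state distribution after k steps
R0 <- v %*% (transitionMatrix1 %^% 1)
R <- R0
repeat {
  R1 <- v %*% (transitionMatrix1 %^% i)
  R <- rbind(R, R1)
  # Stop when the summed difference between the last two rows is exactly zero
  if (rowSums(R[nrow(R) - 1, ] - R1) == 0) {
  #if (rowSums(R[nrow(R) - 1, ] - R1) < epsilon) {
    break
  }
  else {
    i <- i + 1
  }
}
# Column sums over all steps, one value per state
visit1 <- colSums(R)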
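One thing that may matter for the large matrix: the loop above recomputes a full matrix power in every iteration and grows R with rbind. Below is a minimal sketch of the same running sum that carries only the current row forward, with one vector-matrix product per step. The function name visitSums and the arguments tol and maxSteps are assumptions of this sketch, and its stopping rule (largest change in any entry below tol) is not the same as the exact == 0 test above.

```r
# Sketch: sum of the state distributions after 1, 2, ... steps from state `start`.
visitSums <- function(P, start = 1, tol = 1e-12, maxSteps = 10000) {
  cur <- numeric(ncol(P))
  cur[start] <- 1
  total <- numeric(ncol(P))
  steps <- 0
  repeat {
    nxt <- as.vector(cur %*% P)   # distribution after one more step
    total <- total + nxt
    steps <- steps + 1
    # Stop when the distribution has settled or the step cap is reached
    if (max(abs(nxt - cur)) < tol || steps >= maxSteps) break
    cur <- nxt
  }
  list(visits = total, steps = steps)
}

# Tiny made-up chain: Start -> Conversion or Null, Conversion -> Null, Null absorbing
P <- matrix(c(0, 0.5, 0.5,
              0, 0,   1,
              0, 0,   1), nrow = 3, byrow = TRUE)
res <- visitSums(P)
res$visits[1:2]   # transient states only; the absorbing column keeps growing
```

The entry for the absorbing state grows with every extra step, so only the entries for the non-absorbing states are meaningful here.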
I also got the second one working, but only with a small matrix. With the big one it takes forever:
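w <- i  # number of steps reached by the first loop
# C1[j, i]: entry for "Conversion" after j steps when starting in state i
C1 <- matrix(nrow = w, ncol = ncol(transitionMatrix1))
for (i in 1:ncol(transitionMatrix1)) {
  # Start distribution: all mass on state i
  x <- numeric(length = ncol(transitionMatrix1))
  x[i] <- 1
  for (j in 1:w) {
    # Column ncol - 1 is taken to be "Conversion"
    C1[j, i] <- x %*% (transitionMatrix1 %^% j)[, ncol(transitionMatrix1) - 1]
  }
}
convert1 <- colSums(C1)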
I should probably not use loops. Unfortunately, I did not succeed in vectorizing said operations.
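For what it is worth, the value the double loop stores in C1[j, i] is entry (i, "Conversion") of the j-th matrix power, so one column of each power is all that is needed. A sketch of that idea, assuming "Conversion" is column ncol - 1 as in the code above; the function name conversionSums and its arguments are hypothetical and I have not run this on the real data:

```r
# Sketch: sum over j = 1..w of column `target` of the j-th power of P,
# computed for all start states at once.
conversionSums <- function(P, w, target = ncol(P) - 1) {
  col <- numeric(nrow(P))
  col[target] <- 1                 # column `target` of the identity matrix
  total <- numeric(nrow(P))
  for (j in seq_len(w)) {
    col <- as.vector(P %*% col)    # now column `target` of the j-th power
    total <- total + col
  }
  total
}

# Tiny made-up chain: Start -> Conversion or Null, Conversion -> Null, Null absorbing
P <- matrix(c(0, 0.5, 0.5,
              0, 0,   1,
              0, 0,   1), nrow = 3, byrow = TRUE)
conversionSums(P, w = 3)
```

This still loops over the steps, but it does w matrix-vector products in total instead of one matrix power for every combination of start state and step.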