How to break a random walk at destination with R

Question

I have to generate a random walk for a sequence of positions in 2D. The person doing the random walk starts at (0,0). At every move, she goes left, right, up or down. The random walk stops if she comes back to (0,0), or until she made a 1000 steps. *I am using the R language

I have done this so far, but I am having trouble figuring out how to stop the random walk when she reaches (0,0) again. I only get two vectors back. Any help would be very appreciated. Thank you!

step.max<-1000
destination<-rbind(c(0,0))

Random.walk <- function(n=step.max){
  steps <- matrix(c(0,0,-1,1,0,-1,1,0),nrow = 4)
  walk <- steps[sample(1:5,n,replace = TRUE)] 
  walk.1 <-rbind(walk)
  ifelse(destination,break,apply(walk.1,2,cumsum))
  }
Random.walk(n)

Is your steps matrix correct? The first option is no movement which you did not mention in your description, then move 1 down, then move one left and up, and finally move one right. You do not have any option to move up only or left only. For 1000 steps you can generate the entire walk and then identify the first row that returns to 0, 0. — dcarlson, Dec 02 '22 at 04:36
I did add (0,0) to my matrix, and changed it to 5 rows. With which command do I identify the first row? In other words, how can I stop my function once that row is identified? Also, how can I make it, NOT stop at the first command, since I start at (0,0) as well? — Tryingtolearn, Dec 02 '22 at 04:55

dcarlson · Accepted Answer · 2022-12-03T14:52:49.507

If you want to stop when you reach the origin, you will have to use a loop. Alternatively you can generate a random walk of 1000 steps very quickly and identify when the origin was reached:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
steps
#      [,1] [,2]
# [1,]    1    0  # Right
# [2,]   -1    0  # Left
# [3,]    0    1  # Up
# [4,]    0   -1  # Down

Now generate a random walk with 1000 steps:

set.seed(42)  # To create a reproducible example
rand <- sample.int(4, 1000, replace=TRUE)
moves <- steps[rand, ]         # 1000 random moves
position <- rbind(c(0, 0), apply(moves, 2, cumsum))  # Position after each move
home <- which(position[, 1] == 0 & position[, 2] == 0)  # When is position c(0, 0)?
home
# [1] 1   # We never revisit the origin in this random walk.

Now illustrate the results:

plot(position, type="l")
points(0, 0, cex=2, col="red")
points(position[1001, 1], position[1001, 2], cex=2, col="blue")

If you want break out of the walk when the position returns to c(0, 0), you will need a loop. Something like this:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
position <- rbind(c(0, 0))
move <- 0
for (move in 2:1001) {  
    step <- steps[sample.int(4, 1), ]
    position <- rbind(position,  position[move-1, ] + step)
    if (position[move, 1] == 0 & position[move, 2] == 0) break
}
cat("move=", move, "position=", position[move, ], "\n")

The position variable will contain all of the locations visited during the walk.

If you want to know how often the walk returns to the origin before 1000 steps, we need to create a function for the walk and then use replicate to generate multiple walks, for example 10,000:

steps <- matrix(c(1, 0, -1, 0, 0, 1, 0, -1), nrow = 4, byrow=TRUE)
walking <- function(n) {
    rand <- sample.int(4, n, replace=TRUE)
    moves <- steps[rand, ]
    position <- rbind(c(0, 0), apply(moves, 2, cumsum))
    home <- which(position[, 1] == 0 & position[, 2] == 0)
    if (is.na(home[2])) 1000 else home[2]
}

results <- replicate(10000, walking(1000))
sum(results < 1000)/10000
# [1] 0.683

So about 68% of the time the walk ends in less than 1000 steps. You can increase the replications. If you run this several times, the percentage varies from 67 - 68%. You can narrow the range by increasing the number of replications, but the code will take longer to run.

Thank you so much for your time. I am trying the code right now, but I do have a small question.Since you did not add another (0,0) step into your matrix, the random walk will never go back to the origin, but the goal is not to stop it from reaching the origin. Simply to stop the command, ONCE it reached the origin. — Tryingtolearn, Dec 02 '22 at 05:33
Not correct. The origin is not a step. The vector `home` contains all of the times within 1000 steps the location c(0, 0) was reached. Try the code by deleting the `set.seed()` command. — dcarlson, Dec 02 '22 at 16:11
I've added an alternate approach which stops when the origin is revisited or 1000 steps are taken. — dcarlson, Dec 02 '22 at 17:18
The plot commands were designed for the first suggestion to show the entire walk of 1000 steps. If you are using them with the second solution, only `plot(position, type="l", xlab="X", ylab="Y")` will work and only if you have at least 3 positions (rows) in `position`. What does `str(position)` return? — dcarlson, Dec 02 '22 at 18:43
If the walk goes the whole 1000 steps, just change the 1001 in the plot commands to 1000 and it should work. Alternatively change `move in 2:1000` to `move in 2:1001` in the loop. — dcarlson, Dec 02 '22 at 18:51
If I would want to calculate the probability of returning to the origin in less than a 1000 steps, can I still use pnorm? I have never used **sample.int** before, so I am not too sure, how the probabilities are distributed. — Tryingtolearn, Dec 03 '22 at 05:02

jblood94 · Answer 2 · 2022-12-06T16:15:26.867

Here is a fully vectorized function that will return a matrix of positions in a random walk with k steps. The matrix will have k + 1 rows (it includes the original position at the origin) unless the walk returns to the origin.

walk2d <- function(k) {
  k1 <- k + 1L
  m <- matrix(0L, k1, 2)
  m[2:k1 + sample(c(0L, k1), k, 1)] <- sample(c(-1L, 1L), k, 1)
  m[,1] <- cumsum(m[,1])
  m[,2] <- cumsum(m[,2])
  i <- which.min(rowSums(abs(m[-1,])))
  if (i == 1L) m else m[1:(i + 1L),]
}

A walk that returns to the origin on the 14th step:

set.seed(123)
walk2d(20L)
#>       [,1] [,2]
#>  [1,]    0    0
#>  [2,]   -1    0
#>  [3,]   -1   -1
#>  [4,]   -2   -1
#>  [5,]   -1   -1
#>  [6,]   -2   -1
#>  [7,]   -1   -1
#>  [8,]   -1    0
#>  [9,]   -1    1
#> [10,]   -2    1
#> [11,]   -2    0
#> [12,]   -1    0
#> [13,]   -1    1
#> [14,]    0    1
#> [15,]    0    0

A walk that does not return to the origin within k steps

set.seed(94)
walk2d(20L)
#>       [,1] [,2]
#>  [1,]    0    0
#>  [2,]    1    0
#>  [3,]    2    0
#>  [4,]    2    1
#>  [5,]    3    1
#>  [6,]    3    2
#>  [7,]    4    2
#>  [8,]    4    1
#>  [9,]    3    1
#> [10,]    2    1
#> [11,]    3    1
#> [12,]    4    1
#> [13,]    4    0
#> [14,]    4    1
#> [15,]    3    1
#> [16,]    4    1
#> [17,]    5    1
#> [18,]    6    1
#> [19,]    5    1
#> [20,]    4    1
#> [21,]    3    1

To estimate the probability of returning to the origin in less than 1000 steps, here is a vectorized function that will perform n simulations and return the proportion that returned to the origin within k steps. Note that walks can return to the origin only on even steps, so the probability of returning to the origin in less than 1000 steps is the same as the probability of returning within 998 steps.

library(Rfast)
library(parallel)

pReturn <- function(n, k) {
  kn <- k*n
  a <- array(0, c(k, n, 2))
  a[1:kn + sample(c(0, kn), kn, 1)] <- sample(c(-1, 1), kn, 1)
  a[,,1] <- colCumSums(a[,,1])
  a[,,2] <- colCumSums(a[,,2])
  sum(colSums(rowSums(abs(a), dims = 2) == 0) != 0)/n
}

pReturn(1e3, 998)
#> [1] 0.664

Setting n too large will result in memory problems, so call pReturn multiple times to get a better estimate.

cl <- makeCluster(detectCores() - 1L, type = "PSOCK")
clusterExport(cl, "pReturn")
invisible(clusterEvalQ(cl, library(Rfast)))
mean(unlist(parLapply(cl, 1:1e3, function(i) pReturn(1e3, 998)))) # 1M replications
#> [1] 0.678182

This compares well against the exact probability (building off http://oeis.org/A054474 in Mathematica, with m = k/2):

In[1]:= m = 499; gf[x_] = 2 - Pi/(2*EllipticK[4*Sqrt[x]]);
N[Total[(List @@ Normal[Series[gf[x], {x, 0, m - 1}]] /. x -> 1)[[2 ;;
       m + 1]]*Table[4^k, {k, -1, -m, -1}]]]

Out[2]= 0.678159

Amazing! I am not very familiar withe the L notation that you put after some numbers. Thanks for the new knowledge! — Tryingtolearn, Dec 02 '22 at 19:29
If I would want to calculate the probability of returning to the origin in less than a 1000 steps, can I still use pnorm? I have never used sample.int before, so I am not too sure, how the probabilities are distributed — Tryingtolearn, Dec 03 '22 at 05:03
See the update with the exact probability and a function to efficiently estimate it from simulation. — jblood94, Dec 06 '22 at 16:01

How to break a random walk at destination with R

2 Answers2