5

I am trying to code the following game in R:

  • Roll a die until you observe a 4 followed by a 6
  • Count how many rolls it took to observe a 4 followed by a 6
  • Repeat these first two steps 100 times
  • Calculate the average number of rolls it took to observe a 4 followed by a 6

I tried to simulate this manually as follows. I first used the "runif" command in R to "roll a die" a large number of times, hoping that a 4 followed by a 6 would eventually appear (I don't know how to code this using "do until" loops). I repeated this 100 times and put all these rolls into a data frame:

# max = 7 so that floor() can return a 6 (with max = 6 the largest value is 5)
roll_1 = floor(runif(100, min=1, max=7))
roll_2 = floor(runif(100, min=1, max=7))
roll_3 = floor(runif(100, min=1, max=7))
roll_4 = floor(runif(100, min=1, max=7))
roll_5 = floor(runif(100, min=1, max=7))

#etc 

roll_100 = floor(runif(100, min=1, max=7))

all_rolls = data.frame(roll_1, roll_2, roll_3, roll_4, roll_5, roll_100)

This looks as follows:

head(all_rolls)
  roll_1 roll_2 roll_3 roll_4 roll_5 roll_100
1      4      2      5      3      1        4
2      3      2      4      4      1        2
3      1      3      1      4      2        1
4      3      2      1      4      4        3
5      4      1      2      2      5        5
6      2      3      3      5      3        1

I then exported this data frame into Microsoft Excel, manually inspected each column, and noted the first row at which a 6 appears directly after a 4. I then averaged this row number over all columns to get the average number of rolls needed before you observe a 4 followed by a 6. This took some time to do, but it worked.

I am looking for a quicker way to do this. Does anyone know if "do until" loops can be used in R to accelerate this "game"?

Thanks

stats_noob
  • R has while loops. The syntax can be found on the `?Control` help page. This might be a good place to start: https://stackoverflow.com/questions/29597617/number-of-dice-rolling-till-reaches-a-stop-value – MrFlick Sep 07 '21 at 02:36
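
R has no literal "do until" keyword, but `repeat` combined with `break` behaves the same way: the body runs at least once and the exit condition is checked afterwards. A minimal sketch on a simpler game (roll until a 6 shows up); `rolls_until_six` is a hypothetical helper name used only for illustration:

```r
# Roll a die until a 6 shows up and return how many rolls that took.
rolls_until_six <- function() {
  n_rolls <- 0
  repeat {
    roll <- sample(1:6, 1)   # one fair die roll
    n_rolls <- n_rolls + 1
    if (roll == 6) break     # the "until" condition, checked after the body
  }
  n_rolls
}

set.seed(1)  # arbitrary seed, only to make the run reproducible
rolls_until_six()
```

`while (TRUE) { ... break }` is equivalent; `repeat` just makes the intent explicit.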

3 Answers

3

Instead of runif, I would use sample on 1:6, since a die only has the whole-number faces 1 to 6. sample draws those faces directly with equal probability, so there is no rounding step to get wrong.
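
A quick check of the two ways of generating rolls, as a sketch rather than part of the original answer: `floor(runif(n, min = 1, max = 6))` draws from the interval [1, 6) before flooring, so it tops out at 5, whereas `sample(1:6, ...)` covers all six faces. The seed and sample size are arbitrary choices.

```r
set.seed(42)  # arbitrary seed for reproducibility
n <- 10000

via_runif  <- floor(runif(n, min = 1, max = 6))   # values in [1, 6), floored
via_sample <- sample(1:6, n, replace = TRUE)      # draws faces 1..6 directly

range(via_runif)    # the maximum is at most 5: a 6 cannot occur
range(via_sample)   # covers the faces 1 to 6
table(via_sample)   # counts per face, roughly n / 6 each
```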

Here is how you can use a while loop:

roll_from_4_to6 <- function() {
  n <- 1:6
  i <- 0                  # number of rolls so far
  previous_4 <- FALSE     # was the previous roll a 4?
  while(TRUE) {
    current_value <- sample(n, 1)
    i <- i + 1
    # stop as soon as a 6 directly follows a 4
    if(previous_4 && current_value == 6) break
    previous_4 <- current_value == 4
  }
  i
}

Run it once.

roll_from_4_to6()

Run it 100 times and take the average.

mean(replicate(100, roll_from_4_to6()))
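
As a sanity check on the simulated mean, the exact expectation can be worked out from a two-state recursion. This is an addition, not part of the method above. Let E0 be the expected number of remaining rolls when the previous roll was not a 4, and E4 when it was a 4. Then E0 = 1 + (1/6) E4 + (5/6) E0 and E4 = 1 + (1/6) E4 + (4/6) E0 (rolling a 6 after a 4 ends the game). A minimal sketch solving that system in base R:

```r
# Unknowns: E0 (previous roll was not a 4) and E4 (previous roll was a 4).
# Rearranged recursion:
#    (1/6) E0 - (1/6) E4 = 1
#   -(4/6) E0 + (5/6) E4 = 1
A <- matrix(c( 1/6, -1/6,
              -4/6,  5/6), nrow = 2, byrow = TRUE)
b <- c(1, 1)

expected <- solve(A, b)
names(expected) <- c("E0", "E4")
expected  # E0 = 36, E4 = 30
```

So the average over many games should settle near 36 rolls; with only 100 games the simulated mean can still differ from that by a few rolls.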
Ronak Shah
  • @ Ronak Shah: Thank you so much for your answer! This is perfect! Just a question - For each "roll_from_4_to6", is there any way to save the number of times you had to roll the dice before observing the 4 and the 6? This way, we could analyze the histogram of these numbers? Thank you so much for all your help! – stats_noob Sep 07 '21 at 02:53
  • Yes, `roll_from_4_to6()` returns the number of times you had to roll the dice to get 4 followed by 6. So if you want the histogram you can do `hist(replicate(100, roll_from_4_to6()))` – Ronak Shah Sep 07 '21 at 02:55
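
To keep the individual counts for later analysis, the result of `replicate` can be stored in a vector first. The sketch below is self-contained; `rolls_until_4_then_6` is a hypothetical helper written for this example, and the seed is arbitrary.

```r
# Hypothetical helper for this sketch: rolls until a 4 is followed by a 6.
rolls_until_4_then_6 <- function() {
  n_rolls <- 0
  previous_was_4 <- FALSE
  repeat {
    roll <- sample(1:6, 1)
    n_rolls <- n_rolls + 1
    if (previous_was_4 && roll == 6) break
    previous_was_4 <- (roll == 4)
  }
  n_rolls
}

set.seed(123)                                     # arbitrary seed
counts <- replicate(100, rolls_until_4_then_6())  # one count per game

summary(counts)                                   # min, quartiles, mean, max
hist(counts, main = "Rolls until 4 then 6", xlab = "Number of rolls")
```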
2

I considered a different approach to solve this problem, deviating from the exact instructions you received.

Create a sequence of rolls that is extremely large, so you can find 100 cases in which a 6 follows a 4:

x = sample(1:6, 1e6, TRUE)

The mean of rolls needed to get a 6 after a 4 is:

mean(diff(c(0, which(x == 6 & data.table::shift(x) == 4)[1:100])))

What you're doing there:

  • x == 6 & data.table::shift(x) == 4 is a logical vector that is TRUE wherever a roll is a 6 and the roll before it is a 4 (shift lags x by one position by default).
  • which(x == 6 & data.table::shift(x) == 4)[1:100] gives the positions of the first 100 of those TRUEs.
  • diff(c(0, ...)) tells us how many rolls each game took: the first game starts at roll 1, and each later game starts right after the previous match.
  • mean gives us the average of those 100 counts.
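
The same idea can be sketched in base R without data.table by comparing the vector with a copy of itself offset by one position. This is an illustrative variant under the assumption that one long sequence is cut into consecutive games; that works here because the pattern "4, 6" cannot overlap with itself, so each game starts fresh after a match. The seed and sequence length are arbitrary.

```r
set.seed(2021)  # arbitrary seed
x <- sample(1:6, 1e5, replace = TRUE)

# Positions i (i >= 2) where roll i is a 6 and roll i - 1 is a 4.
hits <- which(x[-1] == 6 & x[-length(x)] == 4) + 1

# Rolls per game: the first game starts at roll 1, each later game
# starts right after the previous hit.
rolls_per_game <- diff(c(0, hits))

mean(rolls_per_game[1:100])  # average over the first 100 games
```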
PavoDive
0

A die roll follows a categorical distribution. Using the rcat function from the extraDistr package, you can sample from that distribution:

library(extraDistr)

roll_game <- function() {
  probs <- rep(1/6, 6)        # fair six-sided die
  dices <- rcat(2, probs)     # first two rolls
  count <- 2
  # keep rolling until the last two rolls are a 4 followed by a 6
  while(!(rev(dices)[2] == 4 && rev(dices)[1] == 6)){
    dices <- c(dices, rcat(1, probs))
    count <- count + 1
  }
  count
}

mean(replicate(100, roll_game()))

will give you the answer.

Park