0

To explain - I have a dataset in chronological order of game results. Each row shows the team name, the opponent, the date, and if they won or not. I want to group on two levels (both the team and the opponent) to see how many games in a row one team beat another. That I am able to do. What I want to add to this is to also retain the first date of that streak and the last date.

Here's some sample code for you to use:

library(tidyverse)
test <- data.frame(date = c(1:10),
               team = c(rep(c("red", "blue"),5)),
               opponent = c("black", "white", "black", "white", "white",
                            "black", "white", "white", "black", "white"),
               result = c(1,1,1,0,0,1,0,0,1,1))

test %>% 
  group_by(team, opponent) %>% 
  mutate(consec_wins = ifelse(result == 0, 0, sequence(rle(result)$lengths))) %>% 
  summarise(consec_wins = max(consec_wins))

output

# A tibble: 4 × 3
# Groups:   team [2]
  team  opponent consec_wins
  <chr> <chr>          <dbl>
1 blue  black              1
2 blue  white              1
3 red   black              3
4 red   white              0

This code was able to identify that team red beat team black three times in a row but doesn't state the start/end of that streak. I've tried adding a first() and last() function in summarize but noticed it does it on the group level (team and opponent) and not just the range of the winning streak.

I hope this is enough for you to work with. Much appreciated!

Jeff Henderson
  • 643
  • 6
  • 10
  • Can you show your expected output for the dates – akrun Apr 07 '23 at 06:33
  • Hi @akrun see NicChr's output below. I also had a follow up question maybe you can help with too. Instead of having the max column end with the final date of the streak, can I have it show the following game (which the team would have to had lost)? For instance, team red vs team black would still show 3 consecutive wins and the min would still be 1 but the max would be whatever date number after 9 they played team black again and lost. – Jeff Henderson Apr 07 '23 at 16:43

1 Answers1

1

Let me know if this works. I'm utilising data.table::rleid() for identifying records of unique streaks.

library(dplyr)
library(data.table)

test <- data.frame(date = c(1:10),
                   team = c(rep(c("red", "blue"),5)),
                   opponent = c("black", "white", "black", "white", "white",
                                "black", "white", "white", "black", "white"),
                   result = c(1,1,1,0,0,1,0,0,1,1))
output <- test %>%
  group_by(team, opponent) %>%
  mutate(consec_wins = ifelse(result == 0, 0, sequence(rle(result)$lengths))) %>%
  mutate(win_id = if_else(result == 0, 0, data.table::rleid(result))) %>%
  group_by(team, opponent, win_id) %>%
  mutate(min = min(date),
         max = max(date)) %>%
  group_by(team, opponent) %>%
  arrange(desc(consec_wins), desc(result)) %>%
  slice(1) %>%
  select(team, opponent, consec_wins, min, max)
output
#> # A tibble: 4 x 5
#> # Groups:   team, opponent [4]
#>   team  opponent consec_wins   min   max
#>   <chr> <chr>          <dbl> <int> <int>
#> 1 blue  black              1     6     6
#> 2 blue  white              1     2     2
#> 3 red   black              3     1     9
#> 4 red   white              0     5     7

Created on 2023-04-07 with reprex v2.0.2

NicChr
  • 858
  • 1
  • 9
  • PERFECT! Thanks so much this worked on my actual code. I have a follow up though now that I'm looking at it. What if I want to have the range extend from the start (as is now) to the game that ended the streak (so the game following the last game of the streak). For example, let's say team red played team black on date 12 after their 3 game win streak from example code and lost that game. Can I iterate it that way so the range is then when the streak was broken? @niccr – Jeff Henderson Apr 07 '23 at 16:24
  • I'm thinking a self join back to the original data with all dates and keep the second furthest back? I'll play around but you might have some helpful advice. thanks again. – Jeff Henderson Apr 07 '23 at 16:28
  • Glad it worked! I'm not too sure about the follow up problem, I briefly attempted it but not sure how to complete it without an example output. – NicChr Apr 10 '23 at 10:21