
I have a data set about cows. It contains calving dates and dry-off dates, plus dates in between on which no event occurred. The in-between dates are irregular, so there may be gaps. It looks roughly like this:

ID <- c("A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B")

date <- c("2022-01-01", "2022-01-05", "2022-01-06", "2022-01-07", "2022-01-10", "2022-01-12", 
                  "2022-01-13", "2022-01-16", "2022-01-17", "2022-01-18",
                  "2022-02-01", "2022-02-05", "2022-02-06", "2022-02-07", "2022-02-10", "2022-02-12", 
                  "2022-02-13", "2022-02-16", "2022-02-17", "2022-02-18")
                
event <- c("calved", "NA", "NA", "NA", "dry-off", "NA", "NA", "calved", "NA", "NA", 
           "calved", "NA", "NA", "NA", "dry-off", "NA", "calved", "NA", "NA", "NA") 
df <- data.frame(ID, date, event)
df$date <- as.Date(df$date)

What I want is a new column where the day of calving is "1" and the following rows show the days since calving (days in milk = DIM). The day of dry-off should be "0", and all the days the cow is dry until the next calving should also be "0". The data frame would look like this:

DIM <- c("1", "5", "6", "7", "0", "0", "0", "1", "2", "3",
         "1", "5", "6", "7", "0", "0", "1", "4", "5", "6")

what_I_want <- data.frame(ID, date, event, DIM)

I can do the "1" and "0" with this:

library(dplyr)
library(stringr)

df1 <- df %>%
  mutate(DIM = case_when(str_detect(event, "calved") ~ "1",
                         str_detect(event, "dry-off") ~ "0",
                         TRUE ~ ""))

But then I'm stuck. Thank you for your help!

Sotos
AlHu

1 Answer


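One way to do it is to start a new group at every row where event == 'calved', count the days within each group, and then set every row from the dry-off onwards to 0. Here is how you can achieve this: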

library(dplyr)

df %>% 
 group_by(ID, grp = cumsum(event == 'calved')) %>% 
 mutate(res = cumsum(c(1, diff(date))), 
        res = ifelse(cumsum(event == 'dry-off') > 0, 0, res))

# A tibble: 20 × 5
# Groups:   ID, grp [4]
   ID    date       event     grp   res
   <chr> <date>     <chr>   <int> <dbl>
 1 A     2022-01-01 calved      1     1
 2 A     2022-01-05 NA          1     5
 3 A     2022-01-06 NA          1     6
 4 A     2022-01-07 NA          1     7
 5 A     2022-01-10 dry-off     1     0
 6 A     2022-01-12 NA          1     0
 7 A     2022-01-13 NA          1     0
 8 A     2022-01-16 calved      2     1
 9 A     2022-01-17 NA          2     2
10 A     2022-01-18 NA          2     3
11 B     2022-02-01 calved      3     1
12 B     2022-02-05 NA          3     5
13 B     2022-02-06 NA          3     6
14 B     2022-02-07 NA          3     7
15 B     2022-02-10 dry-off     3     0
16 B     2022-02-12 NA          3     0
17 B     2022-02-13 calved      4     1
18 B     2022-02-16 NA          4     4
19 B     2022-02-17 NA          4     5
20 B     2022-02-18 NA          4     6
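
To see why this gives the right numbers, here is a minimal base R sketch of the same arithmetic on cow A's first lactation only. The vector names (dates, gaps, dim_raw, res_one) are just for illustration and are not part of the code above.

```r
# Cow A's first lactation, copied from the sample data
dates <- as.Date(c("2022-01-01", "2022-01-05", "2022-01-06",
                   "2022-01-07", "2022-01-10"))
event <- c("calved", "NA", "NA", "NA", "dry-off")

# Gaps in days between consecutive records: 4 1 1 3
gaps <- as.numeric(diff(dates))

# Start at 1 on the calving day and accumulate the gaps: 1 5 6 7 10
dim_raw <- cumsum(c(1, gaps))

# The running count of dry-offs is > 0 from the dry-off row onwards,
# so those rows are overwritten with 0: 1 5 6 7 0
res_one <- ifelse(cumsum(event == "dry-off") > 0, 0, dim_raw)
res_one
```

Because the grouping restarts at every "calved" row, both the day count and the dry-off count reset at the next calving.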
Sotos
  • Thank you, but for some reason this does not work with my original data. The created columns grp and res are all NA. Any idea what might be the problem? – AlHu May 13 '23 at 12:01
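
One possible cause, assuming the original data stores missing events as real NA values rather than the string "NA" used in the sample: event == 'calved' then returns NA for those rows, and cumsum() propagates an NA to every element after it. The sketch below uses a small made-up data frame (df_na, hypothetical) and swaps == for %in%, which returns FALSE instead of NA for missing values. This is a guess at the problem, not a confirmed diagnosis.

```r
library(dplyr)

# Hypothetical data: missing events are real NA values, not "NA" strings
df_na <- data.frame(
  ID    = c("A", "A", "A", "A", "A", "A"),
  date  = as.Date(c("2022-01-01", "2022-01-05", "2022-01-10",
                    "2022-01-12", "2022-01-16", "2022-01-17")),
  event = c("calved", NA, "dry-off", NA, "calved", NA)
)

res_na <- df_na %>%
  # %in% gives FALSE (not NA) for missing events, so cumsum() stays defined
  group_by(ID, grp = cumsum(event %in% "calved")) %>%
  mutate(res = cumsum(c(1, as.numeric(diff(date)))),
         res = ifelse(cumsum(event %in% "dry-off") > 0, 0, res)) %>%
  ungroup()

res_na$res
```

If the date column in the real data is not of class Date, diff(date) can also fail or misbehave, so it may be worth checking str(df) first.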