0

I have this table :

   record_id result date_start   date_end
1          1    pos                      
2          1        26/06/2019 28/06/2019
3          1        27/06/2019 29/06/2019
4          1        28/06/2019 30/06/2019
5          1        29/06/2019 01/07/2019
6          2    neg                      
7          2        01/07/2019 03/07/2019
8          2        02/07/2019 04/07/2019
9          2        03/07/2019 05/07/2019
10         2        04/07/2019 06/07/2019
11         2        05/07/2019 07/07/2019
12         3    pos                      
13         3        07/07/2019 09/07/2019
14         3        08/07/2019 10/07/2019

I want to calculate the difference of date for each row, no problem with that. What i want after that is to analyse the group of "pos" and "neg" separately. But i have not the value of result in my data when i have the date. This is data imported from REDCap, with repeat instruments. I use tidyverse, and i think dplyr could help, isn't it a pivot_wider i must do ? I've try, but no way ...

Thanks if anyone could help ...

wibeasley
  • 5,000
  • 3
  • 34
  • 62
Aytan
  • 136
  • 1
  • 9

1 Answers1

2

Like this, to e.g., calculate the mean date difference per group?

library(tidyverse)
library(lubridate)
df %>% 
  fill(result, .direction = "down") %>% 
  filter(!is.na(date_start)) %>% 
  mutate(date_start = dmy(date_start),
         date_end = dmy(date_end)) %>% 
  group_by(result) %>% 
  summarise(mean_date_dif = mean(date_end - date_start))

#`summarise()` ungrouping output (override with `.groups` argument)
## A tibble: 2 x 2
#  result mean_date_dif
#  <chr>  <drtn>       
#1 neg    2 days       
#2 pos    2 days 

Data

df <- tibble::tribble(
        ~record_id, ~result,  ~date_start,    ~date_end,
                1L,   "pos",           NA,           NA,
                1L,      NA, "26/06/2019", "28/06/2019",
                1L,      NA, "27/06/2019", "29/06/2019",
                1L,      NA, "28/06/2019", "30/06/2019",
                1L,      NA, "29/06/2019", "01/07/2019",
                2L,   "neg",           NA,           NA,
                2L,      NA, "01/07/2019", "03/07/2019",
                2L,      NA, "02/07/2019", "04/07/2019",
                2L,      NA, "03/07/2019", "05/07/2019",
                2L,      NA, "04/07/2019", "06/07/2019",
                2L,      NA, "05/07/2019", "07/07/2019",
                3L,   "pos",           NA,           NA,
                3L,      NA, "07/07/2019", "09/07/2019",
                3L,      NA, "08/07/2019", "10/07/2019"
        )
Ahorn
  • 3,686
  • 1
  • 10
  • 17