I'm struggling with this. My data have start/end points of each active period. I'd like to calculate the number of active days.
My data are like this:
data <- data.frame(
"ID" = c("A","A","B","B","C","C","C"),
"start" = c("2022-09-01", "2022-09-01", "2022-09-04", "2022-09-05", "2022-09-25", "2022-09-30", "2022-09-30"),
"end" = c("2022-09-01", "2022-09-02", "2022-09-05", "2022-09-07", "2022-09-27", "2022-09-30", "2022-10-02")
)
I need sums of active days. I had tried as below:
library(tidyverse)
library(lubridate)
data$start <- ymd(data$start)
data$end <- ymd(data$end)
dt_interval <- data %>%
mutate("duration" = 1 + end - start)
dt_interval$duration <- as.numeric(dt_interval$duration)
dt_int_wide <- dt_interval %>%
group_by(ID) %>% mutate(num = row_number()) %>%
pivot_wider(id_cols = ID, names_from = num, values_from = duration)
dt_int_wide <- column_to_rownames(dt_int_wide, var="ID")
dt_int_wide %>%
mutate("active_days" = apply(dt_int_wide, 1, function(x)sum(na.omit(x))))
1 2 3 active_days
A 1 2 NA 3
B 2 3 NA 5
C 3 1 3 7
I noticed that I have counted the same date multiple times.
I want the result as A=2,B=4,C=6
I will appreciate any solution that can count the days without duplication.
Thanks.