I have the following survival dataset, and I would like to split each interval at January 1st of every year it spans. For example, for person ID 1220, I would split at 1912-01-01, 1913-01-01, 1914-01-01, and 1915-01-01. I tried survSplit, but its cut argument only accepts a numeric vector. Is there another way to do this?

In the dataset below, time = EndDate - StartDate. Here is what I have so far:

library(survival)

test.ts <- survSplit(Surv(time, censor) ~ .,
                     data = test,
                     cut = seq(0, 1826.25, 365.25),
                     episode = "tgroup")

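but that only splits the follow-up time into 365.25-day blocks counted from each person's start date, not at calendar-year boundaries.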

    ID    EndDate  StartDate censor time       status
1 1220 1915-03-01 1911-10-04      1 1244        Alive
3 4599 1906-02-15 1903-05-16      1 1006        Alive
4 6375 1899-04-10 1896-10-27      1  895        Alive
6 6386 1929-10-05 1922-01-26      0 1826  Outmigrated
7 6389 1933-12-08 1929-10-05      1 1525  Outmigrated
8 6390 1932-01-17 1927-07-24      1 1638 Dead 0-4 yrs
lmo
Meo

1 Answer

I'm not sure I understood what you want, but if you want to replicate each row of your data frame once for every January 1st that falls between StartDate and EndDate, you can do:

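library(tidyverse)
library(lubridate)

df %>%
  as_tibble() %>%
  mutate(
    # For each row, list the years whose January 1st lies in [StartDate, EndDate]
    RangeYear = map2(StartDate, EndDate, function(start, end) {
      # Year of the first January 1st on or after the start date
      first <- `if`(day(start) == 1 && month(start) == 1,
                    year(start),
                    year(start) + 1)
      # Guard: seq() counts downwards when first > year(end)
      if (first > year(end)) integer(0) else seq(first, year(end))
    })
  ) %>%
  # One row per (ID, year); intervals containing no January 1st yield no rows
  unnest(RangeYear)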
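
If the goal is to get true start/stop intervals rather than one row per year, it may also be possible to stay with survSplit: a Date is stored as a number of days since 1970-01-01, so the January 1st cut points can be passed as numerics once the data are put on the calendar time scale. The following is only a sketch under assumptions: the `test` data frame is a hypothetical two-row copy of the question's data, the new column names (`tstart`, `tstop`, `from`, `to`) are made up for illustration, and `censor` is passed as the event indicator exactly as in the question's own call.

```r
library(survival)

# Hypothetical copy of two rows from the question's data
test <- data.frame(
  ID        = c(1220, 4599),
  StartDate = as.Date(c("1911-10-04", "1903-05-16")),
  EndDate   = as.Date(c("1915-03-01", "1906-02-15")),
  censor    = c(1, 1)
)

# Entry and exit on the calendar scale (days since 1970-01-01)
test$tstart <- as.numeric(test$StartDate)
test$tstop  <- as.numeric(test$EndDate)

# Every January 1st that could fall inside an interval, as numeric cut points
years <- seq(as.integer(format(min(test$StartDate), "%Y")) + 1,
             as.integer(format(max(test$EndDate), "%Y")))
cuts <- as.numeric(as.Date(paste0(years, "-01-01")))

# Counting-process form: each (tstart, tstop] interval is split at the cuts
test.split <- survSplit(Surv(tstart, tstop, censor) ~ .,
                        data = test,
                        cut = cuts,
                        start = "tstart", end = "tstop", event = "censor",
                        episode = "tgroup")

# Convert the numeric interval bounds back to dates for readability
test.split$from <- as.Date(test.split$tstart, origin = "1970-01-01")
test.split$to   <- as.Date(test.split$tstop,  origin = "1970-01-01")
```

Cut points that fall outside a person's interval are ignored for that person, and the event indicator is kept only on each person's last piece, so the result can be fed to a counting-process model such as `coxph(Surv(tstart, tstop, censor) ~ ...)`.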
F. Privé