I'm trying to figure out the quickest way to get survival analysis data into a format that allows for time-varying covariates. Basically, this would be a Python implementation of stsplit in Stata. To give a simple example, start with the following set of information:
id start end x1 x2 exit
1 0 18 12 11 1
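This tells us that an observation started at time 0 and ended at time 18. Exit tells us that this was a 'death' rather than right censoring. x1 and x2 are variables that are constant over time. A second table holds a time-varying covariate, age, where t is the time at which each value takes effect: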
id t age
1 0 30
1 7 40
1 17 50
I'd like to get:
id start end x1 x2 exit age
1 0 7 12 11 0 30
1 7 17 12 11 0 40
1 17 18 12 11 1 50
Exit is only 1 at the end, signifying that t=18 is when the death occurred.
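
For reference, here is a minimal pandas sketch of the transformation I'm describing. It is not a full stsplit equivalent. It assumes one spell per id and that every id has a covariate row at or before its start time. The names `base`, `tv`, and `split_episodes` are my own placeholders, not part of any library.

```python
import pandas as pd

# The two input tables from the example above.
base = pd.DataFrame({"id": [1], "start": [0], "end": [18],
                     "x1": [12], "x2": [11], "exit": [1]})
tv = pd.DataFrame({"id": [1, 1, 1], "t": [0, 7, 17], "age": [30, 40, 50]})


def split_episodes(base, tv):
    """Split each spell at the times where the covariate changes.

    Assumes one spell per id (hypothetical helper, not a library function).
    """
    # Pair each spell with every covariate row for the same id.
    out = base.merge(tv, on="id", how="inner").sort_values(["id", "t"])

    # A piece ends where the next covariate value begins;
    # the last piece ends at the original end of the spell.
    next_t = out.groupby("id")["t"].shift(-1)
    piece_end = next_t.fillna(out["end"]).clip(upper=out["end"])
    out["piece_end"] = piece_end.astype(out["end"].dtype)
    out["piece_start"] = out["t"].clip(lower=out["start"])

    # Drop pieces that fall entirely outside the spell.
    out = out[out["piece_start"] < out["piece_end"]].copy()

    # The event indicator is kept only on the final piece of each spell.
    is_last = out["piece_end"] == out["end"]
    out["exit"] = out["exit"].where(is_last, 0)

    out = (out.drop(columns=["start", "end", "t"])
              .rename(columns={"piece_start": "start", "piece_end": "end"}))
    cols = ["id", "start", "end", "x1", "x2", "exit", "age"]
    return out[cols].reset_index(drop=True)


result = split_episodes(base, tv)
print(result)
```

On the example data this produces the three rows shown above. The merge creates one row per (spell, covariate change) pair, so it may get slow on large data, which is why I'm asking about the quickest way.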