I am trying to perform a Cox proportional hazards analysis with a time-dependent covariate in R. I have read the relevant documentation (Therneau et al.) and several tutorials but am struggling to format my data properly given how the covariate is structured in the data. It is a segmented time-dependent/time-varying covariate.
The outcome of interest is "death", the time variable is "fu" (follow-up time in days), and the time-dependent covariate is recorded in the "asp" (aspirin) columns. For example, the first patient was not taking aspirin at baseline, 30 days, or 1 year but was taking it at the time of last follow-up, in this case 2275 days. Periods represent missing data.
| ID | aspb | asp30 | asp1y | aspfu | death | fu |
|-------|------|-------|-------|-------|-------|-----------|
| 1479 | 0 | 0 | 0 | 1 | 0 | 2275 |
| 10523 | 1 | 1 | . | . | 1 | 41 |
| 25436 | 0 | 0 | 1 | 1 | 0 | 1773 |
I cannot figure out how to get the table into the (start, stop] format en masse, as there are 1000+ cases. It is further complicated by the fact that the time intervals are not necessarily consistent between cases, i.e. the time corresponding to "aspfu" can fall anywhere between 0 days, 30 days, 1 year, ...
I did accomplish this analysis in SPSS using the following notation for the time-dependent covariate: `(T_ < 1)*aspb + (T_ >= 1 & T_ < 31)*asp30 + (T_ >= 31 & T_ < 366)*asp1y + (T_ >= 366)*aspfu`. But I am struggling to translate this to R.
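To make the target concrete, here is a minimal base-R sketch of one way the SPSS expression might map onto (start, stop] rows. It is not a definitive solution. It assumes follow-up times are whole days, so the SPSS boundaries correspond to the intervals (0, 30], (30, 365], and (365, fu]; under that assumption "aspb" only applies at T_ < 1 and never reaches an interval. The names `dat`, `long`, `to_long`, `cuts`, `tstart`, `tstop`, and `asp` are illustrative and not part of the original data.

```r
# The three example rows from the table above; "." is read as NA.
dat <- read.table(text = "
ID aspb asp30 asp1y aspfu death fu
1479 0 0 0 1 0 2275
10523 1 1 . . 1 41
25436 0 0 1 1 0 1773
", header = TRUE, na.strings = ".")

# Assumed cut points mirroring the SPSS expression for whole-day times:
# days 1-30 -> asp30, days 31-365 -> asp1y, day 366 onward -> aspfu.
cuts <- c(0, 30, 365, Inf)
asp_cols <- c("asp30", "asp1y", "aspfu")

# Split one patient (a one-row data frame) into (tstart, tstop] intervals.
to_long <- function(row) {
  keep <- which(cuts[-length(cuts)] < row$fu)   # intervals starting before fu
  tstart <- cuts[keep]
  tstop <- pmin(cuts[keep + 1], row$fu)         # truncate the last interval at fu
  data.frame(
    ID = rep(row$ID, length(keep)),
    tstart = tstart,
    tstop = tstop,
    asp = unname(unlist(row[asp_cols]))[keep],
    # the event can only occur in the interval that ends at fu
    death = as.integer(tstop == row$fu) * row$death
  )
}

long <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) to_long(dat[i, ])))
print(long)

# On the full data set the model could then be fitted along these lines:
# library(survival)
# fit <- coxph(Surv(tstart, tstop, death) ~ asp, data = long)
```

Each patient contributes one row per interval that begins before their follow-up time, and "death" is set only on the row that ends at "fu". Rows where the aspirin value is missing (e.g. the second interval for ID 10523) are kept as NA here; `coxph()` would drop them under the default `na.action`. `survival::survSplit()` with `cut = c(30, 365)` is another way to do the splitting step.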
Any guidance would be appreciated! Thank you very much!