I am running into a roadblock in my survival analysis; I think it has to do with the censoring type. Here are the first 30 rows of my survival data. tstart is the time at which a patient is admitted and starts receiving the Intervention; tstop is the time of either death (status = 1) or discharge (censored, status = 0):
   tstart tstop status Intervention
1       2    14      0        FALSE
2       2     5      0        FALSE
3       2    10      1        FALSE
4       5     8      0        FALSE
5       6    10      0        FALSE
6       6    10      0        FALSE
7       7    10      0        FALSE
8       8    20      1         TRUE
9       8    25      0        FALSE
10      8    18      0        FALSE
11      8    11      0        FALSE
12      8     9      0        FALSE
13      9    11      0        FALSE
14      9    52      0         TRUE
15      9    26      1        FALSE
16     10    20      1         TRUE
17     10    14      0        FALSE
18     10    14      0        FALSE
19     10    11      0        FALSE
20     10    23      0         TRUE
21     10    26      0         TRUE
22     10    16      0        FALSE
23     11    21      0         TRUE
24     11    96      0         TRUE
25     11    14      0        FALSE
26     11    16      0         TRUE
27     11    14      0        FALSE
28     11    16      0        FALSE
29     11    16      0        FALSE
30     11    38      1         TRUE
Depending on how I enter this data into the coxph function, I get two different results. Namely:
# METHOD ONE:
> coxph (Surv (time = (tstop - tstart), event = status) ~ Intervention, data = df.use)
Call:
coxph(formula = Surv(time = (tstop - tstart), event = status) ~
    Intervention, data = df.use)
                     coef exp(coef) se(coef)      z     p
InterventionTRUE -0.05975   0.94200  0.04727 -1.264 0.206
Likelihood ratio test=1.58 on 1 df, p=0.2084
n= 7362, number of events= 2364
# METHOD TWO:
> coxph (Surv (time = tstart, time2 = tstop, event = status) ~ Intervention, data = df.use)
Call:
coxph(formula = Surv(time = tstart, time2 = tstop, event = status) ~
    Intervention, data = df.use)
                     coef exp(coef) se(coef)      z             p
InterventionTRUE -0.29936   0.74129  0.04902 -6.106 0.00000000102
Likelihood ratio test=35.67 on 1 df, p=0.000000002337
n= 7362, number of events= 2364
I expected the two methods to return the same hazard ratio, but the results differ substantially (exp(coef) of 0.942 versus 0.741). Why is this, and how can it be avoided?
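For reproducibility, here is a minimal sketch that rebuilds the 30 rows listed above and constructs the two Surv objects side by side. The name df.head is hypothetical (my full data frame is df.use, with 7362 rows), so fitting on this subset will not reproduce the coefficients shown. The sketch only illustrates that the two calls create different kinds of Surv object, right-censored durations versus counting-process (start, stop] intervals, which may be where the discrepancy comes from.

```r
library(survival)

# First 30 rows of df.use, copied from the listing in the question.
# df.head is a hypothetical name for this subset.
rows <- seq_len(30)
df.head <- data.frame(
  tstart = c(2, 2, 2, 5, 6, 6, 7, 8, 8, 8, 8, 8, 9, 9, 9,
             10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11),
  tstop  = c(14, 5, 10, 8, 10, 10, 10, 20, 25, 18, 11, 9, 11, 52, 26,
             20, 14, 14, 11, 23, 26, 16, 21, 96, 14, 16, 14, 16, 16, 38),
  # Deaths (status = 1) occur in rows 3, 8, 15, 16 and 30.
  status = as.integer(rows %in% c(3, 8, 15, 16, 30)),
  # Intervention is TRUE in rows 8, 14, 16, 20, 21, 23, 24, 26 and 30.
  Intervention = rows %in% c(8, 14, 16, 20, 21, 23, 24, 26, 30)
)

# Method one: a single duration per patient, measured from admission.
s1 <- with(df.head, Surv(time = tstop - tstart, event = status))

# Method two: counting-process form on the original tstart/tstop scale.
s2 <- with(df.head, Surv(time = tstart, time2 = tstop, event = status))

# Compare how the same patients are encoded under each method.
print(head(s1))
print(head(s2))
print(attr(s1, "type"))
print(attr(s2, "type"))
```

Replacing df.head with df.use in the two Surv() calls gives the objects that the two coxph fits above are based on.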