How to calculate confidence intervals for crude survival rates?

Question

Let's assume that we have a survfit object as follows.

fit = survfit(Surv(data$time_12m, data$status_12m) ~ data$group)
fit

Call: survfit(formula = Surv(data$time_12m, data$status_12m) ~ data$group)

                   n events median 0.95LCL 0.95UCL
data$group=HF  10000   3534     NA      NA      NA
data$group=IGT    70     20     NA      NA      NA

The printed fit object does not show any CIs. How can I calculate confidence intervals for the survival rates? Which R packages and code should be used?

IRTFM · Accepted Answer · 2020-09-20T21:43:55.900

The print result of survfit gives confidence intervals by group for median survival time. I'm guessing the NAs for the estimated median times occur because your groups do not have enough events to actually reach a median survival, i.e. the estimated survival curve never drops to 0.5. You should show the output of plot(fit) to see whether my guess is correct.
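One way to check this guess without a plot is to look at the lowest value each estimated curve reaches: the median is reported as NA whenever the curve stays above 0.5. A minimal sketch on the ovarian data that ships with the survival package (not the asker's data; km, grp, min_surv, and medians are names chosen here for illustration):

```r
library(survival)

km <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)

# Label every point of the fitted curves with the group it belongs to
grp <- rep(names(km$strata), km$strata)

# Lowest estimated survival probability reached in each group
min_surv <- tapply(km$surv, grp, min)
print(min_surv)

# Median survival per group; NA where the curve never drops to 0.5
medians <- summary(km)$table[, "median"]
print(medians)
```

In the question's data roughly 35% of the HF group (3534 of 10000) and 29% of the IGT group (20 of 70) had an event, so both curves most likely stay above 0.5 over the 12-month window, which would explain the NA medians.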

You might try to plot the KM curves, noting that the plot.survfit function does have a confidence interval option constructed around proportions:

plot(fit, conf.int=0.95, col=1:2)

Please read ?summary.survfit. It belongs to the family of summary methods that package authors typically use to deliver parameter estimates and confidence intervals. There you will see that it is not "rates" that summary.survfit summarizes, but rather estimates of survival. These can be either medians (in which case the estimate is on the time scale) or estimates at particular times (in which case the estimates are proportions).
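For survival proportions with confidence intervals at a chosen follow-up time, the times argument of summary.survfit can be used. A minimal sketch on the ovarian data from the survival package; the 365-day time point and the object names km, s, and ci are assumptions made for illustration:

```r
library(survival)

km <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)

# Survival estimate and 95% confidence limits at day 365, by group
s <- summary(km, times = 365)
ci <- data.frame(group    = s$strata,
                 time     = s$time,
                 survival = s$surv,
                 lower    = s$lower,
                 upper    = s$upper)
print(ci)
```

For the data in the question, something like summary(fit, times = 12) should work the same way, assuming time_12m is recorded in months. The confidence level and interval type are set when the curve is fitted, through the conf.int and conf.type arguments of survfit.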

If you actually do want rates, then use a function designed for that sort of model, perhaps ?survreg. Compare what you get from survreg versus survfit on the supplied dataset ovarian:

> reg.fit <- survreg( Surv(futime, fustat)~rx, data=ovarian)
> summary(reg.fit)

Call:
survreg(formula = Surv(futime, fustat) ~ rx, data = ovarian)
             Value Std. Error     z       p
(Intercept)  6.265      0.778  8.05 8.3e-16
rx           0.559      0.529  1.06    0.29
Log(scale)  -0.121      0.251 -0.48    0.63

Scale= 0.886 

Weibull distribution
Loglik(model)= -97.4   Loglik(intercept only)= -98
    Chisq= 1.18 on 1 degrees of freedom, p= 0.28 
Number of Newton-Raphson Iterations: 5 
n= 26 

#-------------

> fit <- survfit( Surv(futime, fustat)~rx, data=ovarian)
> summary(fit)
Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)

                rx=1 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   59     13       1    0.923  0.0739        0.789        1.000
  115     12       1    0.846  0.1001        0.671        1.000
  156     11       1    0.769  0.1169        0.571        1.000
  268     10       1    0.692  0.1280        0.482        0.995
  329      9       1    0.615  0.1349        0.400        0.946
  431      8       1    0.538  0.1383        0.326        0.891
  638      5       1    0.431  0.1467        0.221        0.840

                rx=2 
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
  353     13       1    0.923  0.0739        0.789        1.000
  365     12       1    0.846  0.1001        0.671        1.000
  464      9       1    0.752  0.1256        0.542        1.000
  475      8       1    0.658  0.1407        0.433        1.000
  563      7       1    0.564  0.1488        0.336        0.946

It might have been easier if I had used "exponential" instead of the default "weibull" as the distribution type. Exponential fits have a single estimated rate parameter (no separate scale) and are more easily back-transformed to give estimates of rates.
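A sketch of that exponential route, again on ovarian. survreg models log time, so under the exponential distribution the hazard rate is exp(-linear predictor). The interval below is a Wald interval built on the log scale, which is one common choice rather than the only one. The names exp.fit, X, lp, se, and rates are illustrative:

```r
library(survival)

exp.fit <- survreg(Surv(futime, fustat) ~ rx, data = ovarian,
                   dist = "exponential")

b <- coef(exp.fit)
# Keep only the rows/columns of the covariance matrix that match the coefficients
V <- vcov(exp.fit)[seq_along(b), seq_along(b)]

# Design matrix rows for rx = 1 and rx = 2
X  <- cbind(1, c(1, 2))
lp <- drop(X %*% b)                   # linear predictor = log mean survival time
se <- sqrt(diag(X %*% V %*% t(X)))    # standard error of the linear predictor

z <- qnorm(0.975)
rates <- data.frame(rx    = c(1, 2),
                    rate  = exp(-lp),             # events per unit of futime
                    lower = exp(-(lp + z * se)),
                    upper = exp(-(lp - z * se)))
print(rates)
```

Because this model has one parameter per treatment group, the fitted rates should agree with the crude rates, i.e. the number of events divided by the total follow-up time in each group.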

Note: I answered an earlier question about survfit, although the request was for survival times rather than for rates: "Extract survival probabilities in Survfit by groups".
