0

I have performed a Cox regression analysis including four variables (sex, age and two binary explanatory variables) which all have significant associations to outcome. I have used the coxph function from the "survival" package in R:

    library(survival)
cox <- coxph(Surv(time, status_risk==1) ~ sex + age + stone_number +stone_size, data=cox_cut)
    summary(cox1_3_cut)
Call:
coxph(formula = Surv(time, status_risk == 1) ~ sex + age + 
stone_number + stone_size, data = cox_cut)

n= 582, number of events= 48 
(82 observations deleted due to missingness)

                      coef exp(coef) se(coef)      z Pr(>|z|)    
sexfemale              0.76993   2.15961  0.34577  2.227 0.025966 *  
age                   -0.03222   0.96829  0.01201 -2.682 0.007311 ** 
stone_number>=2        0.60646   1.83393  0.29942  2.025 0.042821 *  
stone_size>10          1.02593   2.78969  0.29391  3.491 0.000482 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                  exp(coef) exp(-coef) lower .95 upper .95
sexfemale                2.1596     0.4630    1.0966    4.2530
age                      0.9683     1.0327    0.9458    0.9914
stone_number>=2          1.8339     0.5453    1.0198    3.2980
stone_size>10            2.7897     0.3585    1.5681    4.9629

I would like to make a prediction score table including the four variables with 4 age stratified Groups (30, 40, 50, 60 years of age). All hazards in this table must be divided by one predefined hazard in order to obtain HR for each age group.

How to compute the HR with the 95% CI for each of these specific age-groups in R?

Daniel
  • 97
  • 1
  • 6
  • What code have you tried? Please provide example code and data. – Alex A. May 12 '15 at 16:13
  • If you want to display your data in an age-stratified way, you should also do the analysis that way. That is, split the data into age groups before the analysis and perform seperate Cox models for each age group. – shadow May 12 '15 at 16:15
  • Can you clarify what is requested when you write: "All hazards in this table must be divided by one predefined hazard in order to obtain HR for each age group." The exp(coef) you are offering are _hazard_ratios_ relative to the implicit "baseline hazard". In fact none of those values are estimated "hazards". It's possible you want survival predictions at particular times, but at the moment both the structure of the table and what sorts of values should appear is unclear. – IRTFM May 12 '15 at 17:08
  • @BondedDust: I have until now calculated the single hazards for all the "profiles" of people possible with the obtained variables in the model. An example: the hazard for a female, 30 years of age, with stone_number>=2 and stone_size>10: hazard = 2.16* 0.97^30 * 1.83 * 2.79. The lowest hazard possible was the older man with a single stone of smaller size. The ratio of these two hazards were then the HR. How to obtain this HR with a CI? – Daniel May 12 '15 at 17:43
  • Those are NOT hazards. In fact I'm not sure what they are. One possible interpretation: raising 0.97 to the 30th power would imply a constant risk of 0.03 over 30 intervals, but then you cannot multiply survival probabilities by estimates hazard ratios. Puzzlement ensues. – IRTFM May 12 '15 at 17:52
  • `2.16* 0.97^30 * 1.83 * 2.79` is the **hazard ratio** for females, age 30, with stone_number>=2 and stone_size>10 **Relative** to a male, age 0 with stone_number < 2 and stone_size<=10. Specifically, it is not a hazard, but a comparison of the hazards between the specified group and the reference group (where the reference group is probably not very relevant [i.e. age == 0]). – Jthorpe May 12 '15 at 19:55
  • probably easiest to think about the above quantity in additive terms `exp(0.77 + -0.03 * 30 + 0.61 + 1.03)` – Jthorpe May 12 '15 at 20:08
  • @Jthorpe, yes I just used the product of the hazards. Would you know how to compute the CI for this combined estimate? – Daniel May 13 '15 at 20:58

1 Answers1

1

As per @shadow's comment, the CI of the parameter estimate is based on the whole dataset, if you want age conditional CI's you need to subset your data. If instead you want to generate expected survival curves conditional on a set of covariates (including age) this is how you do it:

# Create a dummy dataset
df <- data.frame(sex = sample(c(T,F),100,T),
                 age = 50 + rnorm(100)*10,
                 foo = sample(c('a','b','c'),100,T),
                 bar = sample(c('x','y','z'),100,T),
                 status = sample(c(T,F),100,T),
                 start = 0,# required for the survfit with `individual = TRUE`
                 time = -log(runif(100)))

# fit the A coxph model in your full dataset data
cox <- coxph(Surv(start,time, status) ~ sex + age + foo + bar, data=df)


# create a data.frame with all the variables used in the formula
newData <- data.frame(sex = T,
                 age = 55,
                 foo = sample(c('a','b','c'),1),
                 bar = sample(c('x','y','z'),1),
                 status = T,# required but unused
                 start = 0,# required but unused
                 time = 1)# required but unused
# get a prediction from the fitted model, specifiying 'individual = TRUE'
pred  <-  survfit(cox, newdata=data.frame(newData),individual =TRUE)

# plot the survival curves 
matplot(x = cbind(pred$"time"),
        y = cbind(pred$surv,
                  pred$upper,
                  pred$lower),
        type = 'l',
        lty= c(1,2,2),
        main = 'Predicted Survial with 95% CI')

You may also wnat to inspect unclass(pred) and summary(pred).

Jthorpe
  • 9,756
  • 2
  • 49
  • 64