I want to estimate an exponential hazards model with one predictor in R. For some reason, I get coefficients with opposite signs when I estimate it using a Poisson `glm` with offset `log(t)` and when I use the `survreg` function from the survival package. I am sure the explanation is perfectly obvious, but I cannot figure it out.

Example

t <- c(89,74,23,74,53,3,177,44,28,43,25,24,31,111,57,20,19,137,45,48,9,17,4,59,7,26,180,56,36,51,6,71,23,6,13,28,16,180,16,25,6,25,4,5,32,94,106,1,69,63,31)
d <- c(0,1,1,0,1,1,0,1,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,0,1,1,0,0,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1)
p <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1)
df <- data.frame(d,t,p)

# exponential hazards model using poisson with offset log(t)
summary(glm(d ~ offset(log(t)) + p, data = df, family = "poisson"))

Produces:

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -5.3868     0.7070  -7.619 2.56e-14 ***
p             1.3932     0.7264   1.918   0.0551 .

Compared to

# exponential hazards model using survreg exponential
library(survival)
summary(survreg(Surv(t,d) ~ p, data = df, dist = "exponential"))

Produces:

            Value Std. Error     z        p
(Intercept)  5.39      0.707  7.62 2.58e-14
p           -1.39      0.726 -1.92 5.51e-02

Why are the coefficients in opposite directions and how would I interpret the results as they stand? Thanks!

fmerhout
  • OK, so I am starting to get the idea from reading [this](http://www.math.ku.dk/~richard/courses/regression2014/survival.html). Whereas the Poisson model estimates the hazard, the `survreg` model is an accelerated failure time model. Since I am using an exponential model and not a Weibull, the coefficients are exactly the same, just with opposite signs. I am still blanking on the interpretation here though. – fmerhout Feb 21 '16 at 22:50
  • This is easy. The response variables in the two models are different. For Poisson, you are modelling the event count/status (since it is 0-1 only), so the coef is like 'risk' or 'hazard', while in `survreg` you model time, so the coef is like 'survival' (a log time ratio, in fact), which is negatively related to 'risk'. The higher the risk/hazard, the shorter the survival time. – Eric Feb 22 '16 at 00:05

1 Answer

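In the second model (`survreg`), the negative coefficient means that an increased value of p is associated with a decreased expected survival time. In the first model (Poisson), the positive coefficient means that an increased p is associated with a higher hazard, i.e. a higher event rate per unit of time at risk. Variations in risk and mean survival time of necessity go in opposite directions. The fact that the absolute values are the same comes from the mathematical identity log(1/x) = -log(x): in exponential models the risk is (exactly) inversely proportional to the mean lifetime. On the original scale, exp(1.39) ≈ 4.0, so the p = 1 group has about four times the hazard, and about one quarter of the expected survival time, of the p = 0 group.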
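To make the log(1/x) = -log(x) point concrete, here is a minimal base-R sketch using the data from the question. It assumes the standard closed-form exponential MLE for a single binary predictor (events divided by total time at risk in each group); the names `rate`, `mean_life`, `log_hazard_ratio` and `log_time_ratio` are mine and only illustrative.

```r
# Data copied from the question
t <- c(89,74,23,74,53,3,177,44,28,43,25,24,31,111,57,20,19,137,45,48,9,17,4,59,7,26,180,56,36,51,6,71,23,6,13,28,16,180,16,25,6,25,4,5,32,94,106,1,69,63,31)
d <- c(0,1,1,0,1,1,0,1,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,0,1,1,0,0,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1)
p <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1)

# Exponential MLE of the hazard in each group: number of events / total time at risk
rate <- tapply(d, p, sum) / tapply(t, p, sum)

# In an exponential model the mean lifetime is the reciprocal of the hazard
mean_life <- 1 / rate

# Log hazard ratio and log ratio of mean lifetimes, p = 1 versus p = 0
log_hazard_ratio <- unname(log(rate["1"] / rate["0"]))
log_time_ratio   <- unname(log(mean_life["1"] / mean_life["0"]))

# Poisson GLM from the question, for comparison with the closed form
fit_pois <- glm(d ~ offset(log(t)) + p, family = poisson)

print(c(log_hazard_ratio = log_hazard_ratio,
        log_time_ratio   = log_time_ratio,
        glm_slope        = unname(coef(fit_pois)["p"])))
```

The first and third values should agree with the Poisson slope reported in the question, and the second should be its negative, which is the sign `survreg` reports.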

IRTFM
  • Thanks, this is exactly what I was looking for. Of course, some credit should also go to @Eric who was first to provide this answer - just not as an answer. – fmerhout Apr 03 '16 at 05:05
  • You might also look at: `summary(survreg(Surv(t,d) ~ p, data = df, dist = "weibull", scale=1))` and note the commentary in the Examples section of `?survreg` – IRTFM Apr 03 '16 at 06:18
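A short sketch of the comparison suggested in the last comment, assuming the survival package is installed. It fits the exponential model, the Weibull model with the scale fixed at 1, and the Poisson model from the question, then lines up the coefficients; the object names `fit_exp`, `fit_wei` and `fit_pois` are only illustrative.

```r
library(survival)

# Data copied from the question
t <- c(89,74,23,74,53,3,177,44,28,43,25,24,31,111,57,20,19,137,45,48,9,17,4,59,7,26,180,56,36,51,6,71,23,6,13,28,16,180,16,25,6,25,4,5,32,94,106,1,69,63,31)
d <- c(0,1,1,0,1,1,0,1,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,0,1,1,0,0,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1,1)
p <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1,1,1)

# Accelerated failure time fits: exponential, and Weibull with scale fixed at 1
fit_exp <- survreg(Surv(t, d) ~ p, dist = "exponential")
fit_wei <- survreg(Surv(t, d) ~ p, dist = "weibull", scale = 1)

# Hazard-scale fit from the question
fit_pois <- glm(d ~ offset(log(t)) + p, family = poisson)

# Side by side: the two survreg columns model log time, the last column is
# the Poisson (log hazard) coefficients with the sign flipped
print(cbind(exponential    = coef(fit_exp),
            weibull_scale1 = coef(fit_wei),
            neg_poisson    = -coef(fit_pois)))
```

If the three columns match up to numerical tolerance, that confirms the two parameterisations describe the same fitted model, one on the log-time scale and one on the log-hazard scale.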