yearly survival rates with cox model [R]

Question

I am quite new to survival analysis so my question can be naive to most of you, I apologize.

I have some left and right censored telemetry data.

I first had to estimate the effect of 3 dummy explanatory variables on survival, so I used an Anderson-Gill model (extension from Cox models). Here, the hazard is a function of Sex, study area (region) and age class (age is divided in 3 age class: kitten (0-1yr), yearling(1-2yrs) and adult(more than 2yrs)). The main “issue” is that age class variable is a time-dependent covariate. Some individuals were followed during some years and thus change age class during the study period. The output of this model provides the effect of each variable I am testing on survival (Sex, region and age class). I could easily predict survival (S(t)) with survfit() and have predicted curves that are representative of cohorts whose covariates correspond to the values in newdata.

My ultimate goal is to estimate a mean yearly survival rate for each cohort (i.e. male adult from northern region) in order to use these estimates as an input in an age-structured matrix population model. I first thought to make a simple average of survival estimates S(t) I get from survfit() function but then I realized that time-dependent covariates seem to make the estimation more complicated. I am even wondering whether it’s possible to estimate mean yearly survival rate for each cohort with Cox models. Maybe annual Kaplan-Meier survival estimates should do the trick? Is it possible to obtain it from survfit() ?

Does someone already had to do the same kind of analysis? Here is the R code that summarize what I have done to estimate the effect of each covariate:

 #create a surv object in order to adjust a Cox model on dataset
mydata.surv <- Surv(mydata$START,mydata$END, mydata$STATE) 

 # Andersen-Gill model
fit <- cph(formula = mydata.surv~SEX+AGE_CLASS+STUDYAREA, data=mydata, 
           method=c("exact"), se.fit=T, linear.predictors=TRUE,  
           x=T, y=T, surv="summary")

 #create new dataset in order to predict survival
newdata <- expand.grid(AGE_CLASS=c(1,2,3),SEX=c("M","F"),
                   STUDYAREA=c("north","mid","south"))  

 #predict survival for each cohorts
out <- summary(survfit(fit, newdata = newdata , type = "breslow"))

score 0 · Answer 1 · answered Jan 16 '18 at 17:59

I think this question should probably go over to Cross-Validated. But here's my take: You should be using a logistic regression model. Once a subject survives for a certain amount of time, they are not censored, but you observed a failure and they re-enter the cohort at a new age. They are not independent unless the survival process is conditionally memoryless (a complicated martingale property that has no bearing on statistics because it is an impossible assumption to test). You can construct an independent sample by observing a specific survival interval for each subject. Compare these results to the pseudolikelihood treating each subject specific age-interval as independent data, then inspect a random effects model and verify the variance of random intercept is relatively small compared to the 95% CIs of the fixed effects.

yearly survival rates with cox model [R]

1 Answers1