I am quite new to survival analysis so my question can be naive to most of you, I apologize.
I have some left and right censored telemetry data.
I first had to estimate the effect of 3 dummy explanatory variables on survival, so I used an Anderson-Gill model (extension from Cox models). Here, the hazard is a function of Sex, study area (region) and age class (age is divided in 3 age class: kitten (0-1yr), yearling(1-2yrs) and adult(more than 2yrs)). The main “issue” is that age class variable is a time-dependent covariate. Some individuals were followed during some years and thus change age class during the study period. The output of this model provides the effect of each variable I am testing on survival (Sex, region and age class). I could easily predict survival (S(t)) with survfit()
and have predicted curves that are representative of cohorts whose covariates correspond to the values in newdata.
My ultimate goal is to estimate a mean yearly survival rate for each cohort (i.e. male adult from northern region) in order to use these estimates as an input in an age-structured matrix population model. I first thought to make a simple average of survival estimates S(t) I get from survfit()
function but then I realized that time-dependent covariates seem to make the estimation more complicated. I am even wondering whether it’s possible to estimate mean yearly survival rate for each cohort with Cox models. Maybe annual Kaplan-Meier survival estimates should do the trick? Is it possible to obtain it from survfit()
?
Does someone already had to do the same kind of analysis? Here is the R code that summarize what I have done to estimate the effect of each covariate:
#create a surv object in order to adjust a Cox model on dataset
mydata.surv <- Surv(mydata$START,mydata$END, mydata$STATE)
# Andersen-Gill model
fit <- cph(formula = mydata.surv~SEX+AGE_CLASS+STUDYAREA, data=mydata,
method=c("exact"), se.fit=T, linear.predictors=TRUE,
x=T, y=T, surv="summary")
#create new dataset in order to predict survival
newdata <- expand.grid(AGE_CLASS=c(1,2,3),SEX=c("M","F"),
STUDYAREA=c("north","mid","south"))
#predict survival for each cohorts
out <- summary(survfit(fit, newdata = newdata , type = "breslow"))