
The following example is modified from the documentation example by adding `age` to the formula:

library(survival)
library(ggfortify)

d.coxph <- survfit(Surv(time, status) ~ sex + age, data = lung)
autoplot(d.coxph)

I get the following error:

Error in levels<-(*tmp*, value = if (nl == nL) as.character(labels) else paste0(labels, : factor level [41] is duplicated

Enter a frame number, or 0 to exit

1: autoplot(d.coxph)
2: autoplot.survfit(d.coxph)
3: fortify(object, surv.connect = surv.connect, fun = fun)
4: fortify.survfit(object, surv.connect = surv.connect, fun = fun)
5: factor(rep(groupIDs, model$strata), levels = groupIDs)

hlu58
  • I don't think that formula is what you want; look at `plot(d.coxph)`. After minimal testing, it doesn't seem like `autoplot` supports a survfit formula with two variables on the right – rawr May 01 '18 at 02:55
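
Frame 5 of the traceback shows where the error is raised: `factor(..., levels = groupIDs)` fails when `groupIDs` contains the same label more than once. A minimal base R sketch of that failure follows; the values of `group_ids` and `strata_sizes` are toy stand-ins, and how `ggfortify` actually derives its labels from the strata names is not shown here.

```r
# Toy stand-ins for the objects in frame 5 of the traceback (hypothetical values).
group_ids <- c("39", "40", "39")   # labels that collide across strata
strata_sizes <- c(2, 1, 3)         # number of time points per stratum

# Duplicated levels make factor() fail; capture the message instead of stopping.
res <- tryCatch(
  factor(rep(group_ids, strata_sizes), levels = group_ids),
  error = function(e) conditionMessage(e)
)
print(res)

# One unique label per stratum avoids the collision.
unique_ids <- c("1.39", "1.40", "2.39")
f <- factor(rep(unique_ids, strata_sizes), levels = unique_ids)
print(table(f))
```

This would explain why any workaround that yields one unique label per stratum lets the call succeed.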

1 Answer


A possible solution is to divide the continuous variable age into categories and to consider the interaction between sex and age in the survfit formula:

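library(survival)
library(ggfortify)  # provides autoplot() for survfit objects
data(lung)
# Split age into two equal-width intervals
lung$age2cat <- cut(lung$age, breaks = 2)
# In lung, sex is coded 1 = male, 2 = female
lung$sex <- factor(lung$sex, labels = c("M", "F"))
d.coxph <- survfit(Surv(time, status) ~ interaction(sex, age2cat), data = lung)

autoplot(d.coxph, conf.int = FALSE, surv.size = 1)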
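
Since `cut(x, breaks = 2)` splits the observed range into two equal-width intervals, the groups need not be balanced. The sketch below inspects the resulting groups and shows an alternative with an explicit cut point; the threshold of 65 and the names `lung2` and `age_explicit` are assumptions chosen for illustration only.

```r
library(survival)

lung2 <- lung  # work on a copy of the packaged data set

# Two equal-width intervals over the observed age range
lung2$age2cat <- cut(lung2$age, breaks = 2)
print(table(lung2$age2cat))

# Alternative: an explicit, hypothetical threshold with readable labels.
# Intervals are right-closed by default: (-Inf, 65] and (65, Inf).
lung2$age_explicit <- cut(lung2$age, breaks = c(-Inf, 65, Inf),
                          labels = c("<=65", ">65"))
print(table(lung2$age_explicit))
```

Either grouping can be passed to `interaction()` in the `survfit` formula in the same way.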


Marco Sandri
  • "interaction" function did the trick. My real problem has two categorical variables. Without "interaction", autoplot fails with the same error as the example I provided in the question; but "plot" function in survival package works fine. – hlu58 May 02 '18 at 03:29
  • @hlu58 If you use `age` as it is instead of dividing it into categories, `interaction` + `autoplot` gives the same plot as `plot` (71 survival curves). Why is using `interaction` a problem for you? – Marco Sandri May 02 '18 at 09:54
  • No, it is not a problem; just a surprise. Thanks for your answer. – hlu58 May 02 '18 at 23:05
  • @MarcoSandri The problem with `interaction` is that the resulting data frame will not have two columns for the two strata; rather, it'll have a single one, with the names concatenated with dots. That *is* a problem if you want to post-process the data frame (you'll have to split the column, re-add it to the data frame, name it, etc.). It'd be much more elegant if `ggfortify` worked in a logical way, i.e., by adding a new column for each covariate. – Tamas Ferenci Feb 23 '20 at 12:55
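
The post-processing described in the last comment can be sketched in base R. The data frame below is a toy stand-in for a fortified `survfit` result; the `strata` column name and its values are assumptions for illustration, not output copied from `ggfortify`.

```r
# Toy data frame standing in for a fortified survfit result (hypothetical values).
df <- data.frame(
  time = c(5, 11, 12, 13),
  strata = c("M.(39,60.5]", "F.(39,60.5]", "M.(60.5,82]", "F.(60.5,82]"),
  stringsAsFactors = FALSE
)

# interaction() joins levels with "." by default; the age intervals also
# contain dots, so split only at the first dot.
df$sex <- sub("\\..*$", "", df$strata)         # text before the first dot
df$age2cat <- sub("^[^.]*\\.", "", df$strata)  # text after the first dot
print(df)
```

This only works when the levels of the first factor contain no dot. Passing a different `sep` to `interaction()`, one that does not occur in any level, would make the split less fragile.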