Stratified log-rank test in R for counting process form data?

Question

Background: at half-year follow up times for 4y, patients may switch to a different medication group. To account for this, I've converted survival data into counting process form. I want to compare survival curves for medication groups A, B, and C. I am using an extended Cox model but want to do pairwise comparisons of each hazard function or do stratified log-rank tests. pairwise_survdiff throws an error because of the form of my data, I think.

Example data:

x<-data.frame(tstart=rep(seq(0,18,6),3),tstop=rep(seq(6,24,6),3), rx = rep(c("A","B","C"),4), death=c(rep(0,11),1))
x

Problem:

When using survdiff in the survival package,

survdiff(Surv(tstart,tstop,death) ~ rx, data = x)

I get the error:

Error in survdiff(Surv(tstart, tstop, death) ~ rx, data = x) : 
  Right censored data only

I think this stems from the counting process form, since I can't find an example online that compares survival curves for time-varying covariates.

Question: is there a quick fix to this problem? Or, is there an alternative package/function with the same versatility to compare survival curves, namely using different methods? How can I implement stratified log-rank tests using survidff on counting process form data?

NOTE: this was marked as a known issue in the survminer package, see github issue here, but updating survminer did not solve my issue, and using one time interval, tstop-tstart wouldn't be correct, since that would leave, e.g., multiple entries at 6 months rather than out to the actual interval of risk.

Conceptually the problem is that when you have time-dependent covariates, you can have a large number of potential pairwise comparisons, A,A,B,C, vs. A,B,C,A vs. .... It seems you want comparisons of A,A,A,A vs. B,B,B,B vs. C,C,C,C but there is nothing special about these treatment profiles and if these treatment profiles don't appear in your data, these particular comparisons would be rather hypothetical. It may be more useful to fit the model an make a multiple comparison of estimated coefficients for each pair of treatment effects? — adibender, Apr 01 '19 at 17:26
I see, thanks for responding. Would implementing that correspond to splitting the data into a group where medication group A appears, making a cox model, then comparing it by, e.g., anova, to a cox model regressed on group B? Something like, `aov(fitA, fitB)` for `fitA<-cox(Surv(tstart,tstop,death) ~ rx=='A, data = x')`? I don't think `survival` can stratify in this way but I could split the `x` dataframe up. — neal, Apr 01 '19 at 17:36
I think you should be able to just replace `survdiff` with `coxph` and fit the model. if your data is in the format of `x` above, its already in the correct format. — adibender, Apr 01 '19 at 20:18
I will try to post an example of how to make multiple comparisons for the coefficients in a bit. — adibender, Apr 01 '19 at 20:19
Awesome, thank you so much! I have been struggling to find a good answer. I was partly confused because I was later trying something like, `cox.fit<-coxph(Surv(tstart, tstop, death) ~ age + strata(rx)` which omits each medication group from a hazard ratio. Then, I was wondering how you could compare hazard functions for each group, A, B, C. There is nothing like `cox.fitA<-coxph(Surv(tstart, tstop, death) ~ age + strata(rx)==A`, right? Anyway, the `coxph` function does work in the form I've got but I wanted to discuss the differences between medication groupings. — neal, Apr 01 '19 at 21:10
When you use strata, each group gets its own baseline hazard, which doesn't really make sense in my opinion for time-dependent covariates — adibender, Apr 01 '19 at 22:02

score 0 · Accepted Answer · answered Apr 01 '19 at 21:59

So, here is an example of fitting the model and making the multiple comparisons using multcomp package. Note that this implicitly assumes that administration of treatments A-C is random. Depending on the assumptions about the process, it might be better to fit a multistate model with transitions between treatments and outcome.

library(purrr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(survival)
library(multcomp)
#> Loading required package: mvtnorm
#> Loading required package: TH.data
#> Loading required package: MASS
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> 
#> Attaching package: 'TH.data'
#> The following object is masked from 'package:MASS':
#> 
#>     geyser
# simulate survival data
set.seed(123)
n <- 200
df <- data.frame(
  id = rep(1:n, each = 8),
  start = rep(seq(0, 42, by = 6), times = 8),
  stop = rep(seq(6, 48, by = 6), times = 8),
  rx = sample(LETTERS[1:3], n * 8, replace = T))
df$hazard <- exp(-3.5  -1 * (df$rx == "A") + .5 * (df$rx == "B") +
  .5 * (df$rx == "C"))

df_surv <- data.frame(id = 1:n)
df_surv$time <- split(df, f = df$id) %>%
  map_dbl(~msm::rpexp(n = 1, rate = .x$hazard, t = .x$start))

df <- df %>% left_join(df_surv)
#> Joining, by = "id"
df <- df %>%
  mutate(status = 1L * (time <= stop)) %>%
  filter(start <= time)
df %>% head()
#>   id start stop rx     hazard     time status
#> 1  1     0    6  A 0.01110900 13.78217      0
#> 2  1     6   12  C 0.04978707 13.78217      0
#> 3  1    12   18  B 0.04978707 13.78217      1
#> 4  2     0    6  B 0.04978707 22.37251      0
#> 5  2     6   12  B 0.04978707 22.37251      0
#> 6  2    12   18  C 0.04978707 22.37251      0

# fit the model 
model <- coxph(Surv(start, stop, status)~rx, data = df)

# define pairwise comparison
glht_rx <- multcomp::glht(model, linfct=multcomp::mcp(rx="Tukey"))
glht_rx
#> 
#>   General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Linear Hypotheses:
#>            Estimate
#> B - A == 0  1.68722
#> C - A == 0  1.60902
#> C - B == 0 -0.07819

# perform multiple comparisons 
# (adjusts for multiple comparisons + takes into account correlation of coefficients -> more power than e.g. bonferroni)
smry_rx <- summary(glht_rx)
smry_rx # -> B and C different to A, but not from each other
#> 
#>   Simultaneous Tests for General Linear Hypotheses
#> 
#> Multiple Comparisons of Means: Tukey Contrasts
#> 
#> 
#> Fit: coxph(formula = Surv(start, stop, status) ~ rx, data = df)
#> 
#> Linear Hypotheses:
#>            Estimate Std. Error z value Pr(>|z|)    
#> B - A == 0  1.68722    0.28315   5.959   <1e-05 ***
#> C - A == 0  1.60902    0.28405   5.665   <1e-05 ***
#> C - B == 0 -0.07819    0.16509  -0.474     0.88    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> (Adjusted p values reported -- single-step method)
# confidence intervals
plot(smry_rx)

^{Created on 2019-04-01 by the reprex package (v0.2.1)}

This is super helpful and very thorough! Thank you so much--exactly what I needed. — neal, Apr 02 '19 at 01:03

Stratified log-rank test in R for counting process form data?

1 Answers1

Linked

Related