I've been working recently with CDISC data that is structured with pre-specified column names and certain expectations for the way survival data are coded.

I want to write a wrapper for survival::Surv() that uses the structured data of the CDISC format. I have a function that is working in most scenarios, but I can't get it to work with survival::coxph().

How can I get my Surv() wrapper function to use default values and work in coxph()? Below are examples using the visR::adtte data set, which is in CDISC format and ships with the visR package (install with devtools::install_github("openpharma/visR")). All examples run without issue except the last one.

Surv_CDISC <- function(AVAL, CNSR) {
  # set default values if not passed by user -----------------------------------
  if (missing(AVAL) && exists("AVAL", envir = rlang::caller_env()))
    AVAL <- get("AVAL", envir = rlang::caller_env())
  else if (missing(AVAL))
    stop("Default 'AVAL' value not found. Specify argument in `Surv_CDISC(AVAL=)`.")
  if (missing(CNSR) && exists("CNSR", envir = rlang::caller_env()))
    CNSR <- get("CNSR", envir = rlang::caller_env())
  else if (missing(CNSR))
    stop("Default 'CNSR' value not found. Specify argument in `Surv_CDISC(CNSR=)`.")
  
  # pass args to `survival::Surv()` --------------------------------------------
  survival::Surv(time = AVAL, event = 1 - CNSR)
}


# passing the arguments, everything works
with(visR::adtte, Surv_CDISC(AVAL = AVAL, CNSR = CNSR)) |> head()
#> [1]  2   3   3  28+ 58  46+
# letting the arguments use default value, everything still works
with(visR::adtte, Surv_CDISC()) |> head()
#> [1]  2   3   3  28+ 58  46+


# using function in model.frame() and defining argument values, everything works
model.frame(Surv_CDISC(AVAL, CNSR) ~ SEX, data = visR::adtte) |> head(n = 2)
#>   Surv_CDISC(AVAL, CNSR) SEX
#> 1                      2   F
#> 2                      3   M
# using function in model.frame() with default arguments, everything works
model.frame(Surv_CDISC() ~ SEX, data = visR::adtte) |> head(n = 2)
#>   Surv_CDISC() SEX
#> 1            2   F
#> 2            3   M


# using function in survfit() and defining argument values, everything works
survival::survfit(Surv_CDISC(AVAL, CNSR) ~ SEX, data = visR::adtte)
#> Call: survfit(formula = Surv_CDISC(AVAL, CNSR) ~ SEX, data = visR::adtte)
#> 
#>         n events median 0.95LCL 0.95UCL
#> SEX=F 143     80     64      47      96
#> SEX=M 111     72     41      30      57
# using function in survfit() with default arguments, everything works
survival::survfit(Surv_CDISC() ~ SEX, data = visR::adtte)
#> Call: survfit(formula = Surv_CDISC() ~ SEX, data = visR::adtte)
#> 
#>         n events median 0.95LCL 0.95UCL
#> SEX=F 143     80     64      47      96
#> SEX=M 111     72     41      30      57


# using function in coxph() and defining argument values, everything works
survival::coxph(Surv_CDISC(AVAL, CNSR) ~ SEX, data = visR::adtte)
#> Call:
#> survival::coxph(formula = Surv_CDISC(AVAL, CNSR) ~ SEX, data = visR::adtte)
#> 
#>        coef exp(coef) se(coef)     z     p
#> SEXM 0.3147    1.3699   0.1626 1.935 0.053
#> 
#> Likelihood ratio test=3.71  on 1 df, p=0.05412
#> n= 254, number of events= 152
# DOES NOT WORK TRYING TO RELY ON DEFAULT VALUES
survival::coxph(Surv_CDISC() ~ SEX, data = visR::adtte)
#> Error in x[[2]]: subscript out of bounds

Created on 2022-06-05 by the reprex package (v2.0.1)

Daniel D. Sjoberg
  • I haven't debugged your code carefully, but using `caller_env()` for the default looks like a likely source of errors. How do you know which function will call `Surv_CDISC()`? It would be much safer to set a global variable for the source, even though using globals is generally dangerous. – user2554330 Jun 05 '22 at 18:19
  • Many many functions could call `Surv_CDISC()`, just like `Surv()`. I agree global variables are dangerous, and wouldn't be a solution in this situation. In the `coxph()` example, it seems that AVAL and CNSR are found (otherwise, there would be an error). It is strange to me that `Surv_CDISC()` seems to run without error, but the error appears later in the processing of `coxph()`. – Daniel D. Sjoberg Jun 05 '22 at 18:25
  • Functions in formulas aren't evaluated immediately, they are evaluated by the functions that use the formula, generally in a strange context (e.g. contents of `data` are available in the environment). – user2554330 Jun 05 '22 at 18:28
  • You get data from the `visR` package, but it's not on CRAN. How do you install it? – user2554330 Jun 05 '22 at 18:29
  • I see now: `visR` was removed from CRAN yesterday. – user2554330 Jun 05 '22 at 18:31
  • Updated post with install instructions, `devtools::install_github("openpharma/visR")` – Daniel D. Sjoberg Jun 05 '22 at 18:33
  • I checked the internals of `survival::coxph()`, and they parse the LHS of the formula, e.g. `formula[1:2]`. This works well when the inputs are explicitly passed to the function. But if the formula relies on defaults in the way I've written (i.e., the arguments are missing and filled in later), there is nothing to parse! I wonder if they could be explicitly added in the function definition as the defaults in some way? – Daniel D. Sjoberg Jun 05 '22 at 18:38
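
To make the point about parsing concrete, here is a minimal base-R sketch (it is not the `survival` source code) of what the left-hand side of each formula looks like as an unevaluated call. The names `f_args`, `f_default`, `lhs_args`, `lhs_default`, and `res` are for illustration only. Building a formula does not evaluate its contents, so the snippet runs without `Surv_CDISC()` being defined.

```r
# The left-hand side of a formula is stored as an unevaluated call object.
f_args    <- Surv_CDISC(AVAL, CNSR) ~ SEX
f_default <- Surv_CDISC() ~ SEX

lhs_args    <- f_args[[2]]     # the call Surv_CDISC(AVAL, CNSR)
lhs_default <- f_default[[2]]  # the call Surv_CDISC()

# A call has one element for the function name plus one per argument.
length(lhs_args)     # function name + two arguments
length(lhs_default)  # function name only

lhs_args[[2]]        # first argument: the symbol AVAL

# Asking for the first argument of a zero-argument call is an error;
# capture the message instead of stopping the script.
res <- tryCatch(lhs_default[[2]], error = function(e) conditionMessage(e))
res
```

With no arguments in the call, any code that reaches for element `[[2]]` of the left-hand side has nothing to index, which is consistent with the `x[[2]]` error reported in the question.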

1 Answer

This looks like a bug in the survival package, or maybe a misuse of it (I'm not so familiar with the internals).

EDITED TO ADD A COMMENT:

I think the analysis below is wrong. Reading the code more carefully, I think the current code in the survival package won't work reliably unless you use the explicit Surv(AVAL, CNSR) in the formula.

HERE'S THE ANALYSIS THAT APPEARED TO WORK, BUT I DON'T TRUST IT:

The problem is that survival:::terms.inner looks specifically for a function named Surv, here: https://github.com/therneau/survival/blob/b5238a42867a931954cf222b871a7b3a1c2fcd24/R/xtras.R#L65 . Since your function has a different name, it's not handled as if it is the same thing.

You could fix this by naming your function Surv as well. When I do that, things appear to work. Of course, this may cause problems elsewhere when you want the original Surv without the survival:: prefix, but I don't know a way to fix that.

I'd still worry about using caller_env(). Here's how I'd create your fake Surv:

make_surv_CDISC <- function(defaults) {

  force(defaults)
  
  function(AVAL = defaults$AVAL, 
           CNSR = defaults$CNSR) {
    
    # pass args to `survival::Surv()` ------------------------------------------
    survival::Surv(time = AVAL, event = 1 - CNSR)
  }
}

Surv <- make_surv_CDISC(visR::adtte)

This is less general than yours, but I think it's safer.

user2554330
  • Awesome, thanks. I don't think the function is _required_ to be called `Surv()`, because my example above is called `Surv_CDISC()` and it works when the arguments are specified. I will look further into the internals....see what I can find. I think the error arises because the function on the LHS of the formula is being parsed in `terms.inner()`, but the call doesn't have args to parse. – Daniel D. Sjoberg Jun 05 '22 at 20:08
  • Reading through `terms.inner`, I can't see why it would have worked when `Surv_CDISC` was renamed to `Surv`. So maybe it didn't actually work, it just looked like it did. I posted this as an issue on the `survival` Github site. – user2554330 Jun 05 '22 at 20:36
  • OH sorry, I meant that it worked when I used `Surv_CDISC(AVAL, CNSR) ~ SEX`, suggesting that the function does _not_ need to be called `Surv()`. Rather, I still think the issue is that `terms.inner()` is parsing the arguments in `Surv_CDISC()` and when it finds no arguments (because I am trying to make it work with default values), there is a subscript error because there are no arguments and the subscripts are out of bounds. – Daniel D. Sjoberg Jun 05 '22 at 20:53
  • I was reading the comments above the `terms.inner()` function definition: "This is used to generate a warning in coxph if the same variable is used on both sides, so perfection is not required of the function." So this function is only used to signal a potential syntax error to users, and is not a part of the `coxph()` estimation. I will prepare a pull request for this case and see if they are amenable to merging it. – Daniel D. Sjoberg Jun 05 '22 at 20:58
  • thanks for convo, it really helped diagnose the issue! FYI, I submitted a PR that would fix the issue. I hope they consider merging it or implementing something similar that will allow for the syntax I would like to use. https://github.com/therneau/survival/pull/200 – Daniel D. Sjoberg Jun 05 '22 at 21:19
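
The failure mode discussed in these comments can be reproduced without the survival package. The sketch below uses a hypothetical helper, `collect_vars()`, which is neither the survival code nor the change proposed in the pull request. It only illustrates, under that assumption, how a recursive walk that always descends into `x[[2]]` breaks on a zero-argument call, and how iterating over whatever arguments exist avoids the problem.

```r
# Hypothetical helper (NOT survival's terms.inner): collect the variable
# names that appear inside an unevaluated expression.
collect_vars <- function(x, guard = TRUE) {
  if (is.name(x)) return(as.character(x))   # a bare variable name
  if (!is.call(x)) return(character(0))     # constants such as 1
  if (!guard) {
    # unguarded: assumes every call has at least one argument
    return(collect_vars(x[[2]], guard = FALSE))
  }
  # guarded: as.list(x)[-1] is an empty list for a zero-argument call
  args <- as.list(x)[-1]
  as.character(unlist(lapply(args, collect_vars, guard = TRUE)))
}

# Explicit arguments: both variable names are found.
with_args <- collect_vars(quote(Surv_CDISC(AVAL, CNSR)))

# No arguments: the guarded walk returns an empty character vector.
no_args <- collect_vars(quote(Surv_CDISC()))

# No arguments, unguarded walk: indexing x[[2]] fails; keep the message.
unguarded <- tryCatch(
  collect_vars(quote(Surv_CDISC()), guard = FALSE),
  error = function(e) conditionMessage(e)
)
```

Since the quoted source comment says the function exists only to warn about a variable appearing on both sides of the formula, returning nothing for a zero-argument call seems like a reasonable design choice for this kind of check.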