0

I want to make a likelihood ratio test between two models in R.

  • The 2 models are nested
  • I removed all the missing item, creating a clean dataframe.

Even though, when I do anova(mod2,mod3) I get this error message :

Error in anova.coxphlist(c(list(object), dotargs), test = test) :
models were not all fitted to the same size of dataset In addition: Warning message: In anova.coxphlist(c(list(object), dotargs), test = test) : Models with response c(" sep = \"\"))))", " sep = \"\"))))") removed because response differs from model 1

I am sorry, I cannot give a sample of the data I used. I give below the summary of my 2 models :

> mod2
Call:
coxph(formula = Surv(eval(parse(text = paste(j, "_followup", sep = ""))), 
    eval(parse(text = paste(j, sep = "")))) ~ strata(trt) + Var1 + 
    Var2 + Var3 + rcs(Var4, 4), 
    data = data.clean)

                      coef exp(coef)  se(coef)      z       p
Var1               0.030361  1.030827  0.012188  2.491 0.01274
Var2               0.022405  1.022658  0.007463  3.002 0.00268
Var3:1             0.233716  1.263286  0.433950  0.539 0.59018
rcs(Var4, 4)Var4   0.315988  1.371613  0.111433  2.836 0.00457
rcs(Var4, 4)Var4' -0.799272  0.449656  0.378427 -2.112 0.03468

Likelihood ratio test=37.7  on 5 df, p=4.329e-07
n= 375, number of events= 74 
> mod3
Call:
coxph(formula = Surv(eval(parse(text = paste(j, "_followup", sep = ""))), 
    eval(parse(text = paste(j, sep = "")))) ~ strata(trt) + Var1 + 
    Var2 +Var3 + rcs(Var4, 4) + 
    eval(parse(text = paste(i, sep = ""))), data = data.clean)

                        coef exp(coef)  se(coef)      z       p
Var1                0.024004  1.024295  0.012742  1.884 0.05959
Var2                0.018190  1.018356  0.007881  2.308 0.02099
Var3:1              0.348887  1.417489  0.437326  0.798 0.42500
rcs(Var4, 4)Var4    0.289575  1.335859  0.110936  2.610 0.00905
rcs(Var4, 4)Var4'  -0.705976  0.493627  0.377903 -1.868 0.06174
eval(parse(text = paste(i, sep = "")))  0.742461  2.101100  0.304491  2.438 0.01475

Likelihood ratio test=43.69  on 6 df, p=8.499e-08
n= 375, number of events= 74 



>  anova(mod2,mod3)
    Error in anova.coxphlist(c(list(object), dotargs), test = test) : 
      models were not all fitted to the same size of dataset
    In addition: Warning message:
    In anova.coxphlist(c(list(object), dotargs), test = test) :
      Models with response c("    sep = \"\"))))", "    sep = \"\"))))") removed because response differs from model 1

Maybe, it is due to the presence of a strata term ? Or the presence of the splined fitted version of the Var4 ?

Thanks for your suggestion !

Flora Grappelli
  • 659
  • 9
  • 26
  • can you make another variable in data.clean, eg. data.clean$Var5 <- eval(parse(text = paste(i, sep = ""))). And do anova on this? – StupidWolf Oct 28 '19 at 10:52
  • I think it is a good suggestion! I finally found the problem after many searches... It was indeed the use of `eval(parse(text = paste(i, sep = "")))`. The `anova` function uses the formula of the coxph model. If the formula is written in a awkward way, it cannot be re-used in the `anova` function – Flora Grappelli Oct 28 '19 at 13:19
  • @StupidWolf The `data.clean` already includes `Var5`. Since I am using it in a loop, I had to use this non-obvious way of getting the variable in the `data.clean`. thanks anyway !! – Flora Grappelli Oct 28 '19 at 13:21
  • Ok I can kind of guess what you are trying to do. You are trying to loop through different variables. Why don't you try making the formula first, e.g f = as.formula( ) and passing this into coxph. – StupidWolf Oct 28 '19 at 13:55
  • It is exactly what I did ! I changed the formula that I used in my coxph models :-) – Flora Grappelli Oct 28 '19 at 15:29

1 Answers1

0

The problem is that the anova function uses the formula of the coxphfunction. If the formula contains eval, paste, etc. , the anova function may not work

To solve this, we have two solutions :

1) as suggested by @StupidWolf in the comments above, specify f = as.formula(...) after applying coxph on each of the two models.

2 ) calculate the differents elements of anova without the specific function as follows:

Df = sum(anova(mod2)$Df, na.rm = T) - sum(anova(mod1)$Df, na.rm = T)
Chisq = abs(as.numeric(logLik(mod2) - logLik(mod1)) * 2)
pval = pchisq(Chisq, Df, lower.tail=F)
Flora Grappelli
  • 659
  • 9
  • 26