0

A minimal example is given as below. Why the last row (the last level of combination) got NA? I thought it should be treated as a reference level, am I right? Which is the reference level in the combination?

> library(survival)
> summary(coxph(Surv(time, status) ~ factor(ph.ecog) * factor(sex), data = lung))
Call:
coxph(formula = Surv(time, status) ~ factor(ph.ecog) * factor(sex), 
    data = lung)

  n= 227, number of events= 164 
   (因为不存在,1个观察量被删除了)

                                  coef exp(coef) se(coef)      z Pr(>|z|)   
factor(ph.ecog)1               0.40410   1.49796  0.23525  1.718  0.08584 . 
factor(ph.ecog)2               0.84691   2.33242  0.26889  3.150  0.00163 **
factor(ph.ecog)3               2.01808   7.52390  1.03086  1.958  0.05027 . 
factor(sex)2                  -0.67710   0.50809  0.38435 -1.762  0.07812 . 
factor(ph.ecog)1:factor(sex)2  0.07719   1.08024  0.45168  0.171  0.86432   
factor(ph.ecog)2:factor(sex)2  0.32936   1.39008  0.49576  0.664  0.50646   
factor(ph.ecog)3:factor(sex)2       NA        NA  0.00000     NA       NA   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                              exp(coef) exp(-coef) lower .95 upper .95
factor(ph.ecog)1                 1.4980     0.6676    0.9446     2.375
factor(ph.ecog)2                 2.3324     0.4287    1.3770     3.951
factor(ph.ecog)3                 7.5239     0.1329    0.9976    56.743
factor(sex)2                     0.5081     1.9682    0.2392     1.079
factor(ph.ecog)1:factor(sex)2    1.0802     0.9257    0.4457     2.618
factor(ph.ecog)2:factor(sex)2    1.3901     0.7194    0.5261     3.673
factor(ph.ecog)3:factor(sex)2        NA         NA        NA        NA

Concordance= 0.643  (se = 0.025 )
Likelihood ratio test= 30.07  on 6 df,   p=0.00004
Wald test            = 29.75  on 6 df,   p=0.00004
Score (logrank) test = 32.92  on 6 df,   p=0.00001

More strange is that when focus on the interaction, one or more last levels would got NA.

> summary(coxph(Surv(time, status) ~ factor(ph.ecog) : factor(sex), data = lung))
Call:
coxph(formula = Surv(time, status) ~ factor(ph.ecog):factor(sex), 
    data = lung)

  n= 227, number of events= 164 
   (因为不存在,1个观察量被删除了)

                                  coef exp(coef) se(coef)      z Pr(>|z|)   
factor(ph.ecog)0:factor(sex)1 -0.49916   0.60704  0.31688 -1.575  0.11520   
factor(ph.ecog)1:factor(sex)1 -0.09506   0.90932  0.28525 -0.333  0.73894   
factor(ph.ecog)2:factor(sex)1  0.34774   1.41587  0.31502  1.104  0.26964   
factor(ph.ecog)3:factor(sex)1  1.51892   4.56729  1.04054  1.460  0.14436   
factor(ph.ecog)0:factor(sex)2 -1.17627   0.30843  0.41739 -2.818  0.00483 **
factor(ph.ecog)1:factor(sex)2 -0.69498   0.49909  0.31614 -2.198  0.02793 * 
factor(ph.ecog)2:factor(sex)2       NA        NA  0.00000     NA       NA   
factor(ph.ecog)3:factor(sex)2       NA        NA  0.00000     NA       NA   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                              exp(coef) exp(-coef) lower .95 upper .95
factor(ph.ecog)0:factor(sex)1    0.6070     1.6473    0.3262    1.1296
factor(ph.ecog)1:factor(sex)1    0.9093     1.0997    0.5199    1.5904
factor(ph.ecog)2:factor(sex)1    1.4159     0.7063    0.7636    2.6252
factor(ph.ecog)3:factor(sex)1    4.5673     0.2189    0.5942   35.1048
factor(ph.ecog)0:factor(sex)2    0.3084     3.2422    0.1361    0.6989
factor(ph.ecog)1:factor(sex)2    0.4991     2.0037    0.2686    0.9274
factor(ph.ecog)2:factor(sex)2        NA         NA        NA        NA
factor(ph.ecog)3:factor(sex)2        NA         NA        NA        NA

Concordance= 0.643  (se = 0.025 )
Likelihood ratio test= 30.07  on 6 df,   p=0.00004
Wald test            = 29.75  on 6 df,   p=0.00004
Score (logrank) test = 32.92  on 6 df,   p=0.00001

UPDATE

Thanks the operation suggested by @rawr, I checked my real data and found it cannot be explained by this:

> with(data_case, ftable(class, Grouping, OS_Status))
               OS_Status  0  1
class Grouping                
A     a                  33 14
      b                  22 26
B     a                  49 21
      b                  43 28
C     a                  86 25
      b                  77 42

> fit = coxph(Surv(OS, OS_Status) ~ class:Grouping, data_case) 
> summary(fit)
Call:
coxph(formula = Surv(OS, OS_Status) ~ class:Grouping, data = data_case)

  n= 466, number of events= 156 

                    coef exp(coef) se(coef)      z Pr(>|z|)  
classA:Groupinga -0.3504    0.7044   0.3099 -1.131   0.2582  
classB:Groupinga -0.4621    0.6299   0.2695 -1.715   0.0864 .
classC:Groupinga -0.6477    0.5232   0.2534 -2.556   0.0106 *
classA:Groupingb  0.3717    1.4502   0.2503  1.485   0.1376  
classB:Groupingb -0.1213    0.8858   0.2455 -0.494   0.6213  
classC:Groupingb      NA        NA   0.0000     NA       NA  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                 exp(coef) exp(-coef) lower .95 upper .95
classA:Groupinga    0.7044     1.4196    0.3838    1.2930
classB:Groupinga    0.6299     1.5875    0.3714    1.0684
classC:Groupinga    0.5232     1.9112    0.3184    0.8598
classA:Groupingb    1.4502     0.6896    0.8879    2.3685
classB:Groupingb    0.8858     1.1289    0.5475    1.4331
classC:Groupingb        NA         NA        NA        NA

Concordance= 0.58  (se = 0.027 )
Likelihood ratio test= 16.48  on 5 df,   p=0.006
Wald test            = 16.74  on 5 df,   p=0.005
Score (logrank) test = 17.5  on 5 df,   p=0.004

Shixiang Wang
  • 2,147
  • 2
  • 24
  • 33
  • 1
    see `with(lung, ftable(ph.ecog, sex, status))`, there are no observations that fit that profile – rawr Apr 21 '22 at 05:18
  • @rawr It seems can explain the dataset above, but my real data has observations for last level. – Shixiang Wang Apr 21 '22 at 11:54
  • 1
    well for the updated case in your data, you notice that there are 6 levels of the interaction and 6 rows in the table so the reference is `C:b` and no estimates for it. you can convince yourself of this by making an interaction variable `class_group <- relevel(interaction(class, Grouping), ref = 'C.b')` and fitting the model with that instead – rawr Apr 21 '22 at 16:27
  • @rawr Thanks :). Yeah, it's similar, the last level will get NAs. So reference can explain this. – Shixiang Wang May 18 '22 at 00:45

0 Answers0