I want to impute some missing data in a table and then run a Cox model on the imputed table.

I can get the imputation to run on my data, and the Cox model to run on the imputed data, but I don't understand how to view the pooled Cox output for the imputed data sets. Specifically, I need the hazard ratios (HR) and p-values in my output.

The commands are:

library("mice")
Table <- read.table("TestTable", stringsAsFactors=TRUE, header=TRUE)

Then I make sure my relevant variables are factors (e.g. Cohort can be 0 or 1, and these should be treated as different categories).

Table$Cohort <- as.factor(Table$Cohort)
Table$Sex <- as.factor(Table$Sex)
Table$Type <- as.factor(Table$Type)
Table$Grade <- as.factor(Table$Grade)
Table$Comorbidity <- as.factor(Table$Comorbidity)
Table$SNP1 <- as.factor(Table$SNP1)
Table$SNP2 <- as.factor(Table$SNP2)

Then I relevel the factors to make the Cox model easier to interpret later on:

Table$SNP1 <- relevel(Table$SNP1, "WT")
Table$SNP2 <- relevel(Table$SNP2, "WT")
Table$Grade <- relevel(Table$Grade, "1")
Table$Comorbidity <- relevel(Table$Comorbidity, "1")

Then I imputed the data, using polyreg for factors with more than two levels and logreg for factors with two levels:

imp <- mice(Table, maxit=5, seed=12345, method=c("","","","","","","","","","","","polyreg","polyreg","logreg","logreg"))
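
A positional method vector like the one above is easy to misalign with the columns. One possible alternative, sketched here on a small made-up data frame (`toy` and its columns are hypothetical stand-ins, not the original TestTable), is to do a dry run with maxit = 0 to get the default methods and then override entries by column name:

```r
library(mice)

# Hypothetical toy data standing in for TestTable
set.seed(1)
n <- 60
toy <- data.frame(
  Age  = rnorm(n, 60, 10),
  Sex  = factor(sample(c("F", "M"), n, replace = TRUE)),
  SNP1 = factor(sample(c("WT", "HET", "HOM"), n, replace = TRUE))
)
toy$Sex[sample(n, 8)]  <- NA
toy$SNP1[sample(n, 8)] <- NA

# Dry run: no iterations, only sets up the default method per column
ini  <- mice(toy, maxit = 0, printFlag = FALSE)
meth <- ini$method          # named character vector, one entry per column
meth["Sex"]  <- "logreg"    # factor with two levels
meth["SNP1"] <- "polyreg"   # unordered factor with more than two levels

imp_toy <- mice(toy, method = meth, maxit = 5, seed = 12345, printFlag = FALSE)
```

Columns without missing values keep the empty method "", so they are used as predictors but not imputed.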

Then I ran the Cox model on the imputed data sets:

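library("survival")
# Note: this changes Table only, not the already-created imp object;
# to affect the imputed data, do the conversion before calling mice().
Table$Survival <- as.numeric(Table$Survival)
# Fit the same Cox model on each imputed data set
cox_with_imp <- with(imp, coxph(Surv(Survival, Event) ~ strata(Cohort) + strata(Grade) + strata(Comorbidity) + factor(SNP1) + factor(SNP2)))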

The output is 5 Cox model fits, one per imputed data set. I'm having trouble pooling them. When I type "pool(cox_with_imp)", it gives me some statistics, but I want a pooled table with HR and p-values.

Does anyone know the command to pool the 5 imputed Cox models into one consensus Cox model with HR and p-values?

Thanks.

user1288515

2 Answers

You cannot combine these p-values directly to get valid inferences, because under the null hypothesis these p-values are uniformly distributed and Rubin’s combining rules require a normal distribution or a t-distribution.
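
What does get pooled is each coefficient together with its variance. As a rough sketch with made-up numbers (the estimates and standard errors below are hypothetical, not taken from the data in the question), Rubin's rules for a single log hazard ratio could be written in base R like this:

```r
# Hypothetical log hazard ratios and standard errors from m = 5 imputed fits
est <- c(0.42, 0.38, 0.47, 0.40, 0.44)
se  <- c(0.15, 0.16, 0.15, 0.14, 0.16)
m   <- length(est)

qbar <- mean(est)              # pooled estimate (log HR)
ubar <- mean(se^2)             # within-imputation variance
b    <- var(est)               # between-imputation variance
tvar <- ubar + (1 + 1/m) * b   # total variance

# Classic Rubin (1987) degrees of freedom; software such as mice
# may apply a small-sample adjustment, so its df can differ.
r  <- (1 + 1/m) * b / ubar
df <- (m - 1) * (1 + 1/r)^2

# t-test of the pooled coefficient against 0
tstat <- qbar / sqrt(tvar)
p     <- 2 * pt(-abs(tstat), df)

c(HR = exp(qbar), p.value = p)
```

The p-value here comes from the pooled estimate and its total variance, not from averaging the five per-fit p-values.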

Piet

You can write your own function to get the HR though, by just exponentiating the regression coefficients.

What Piet said seems right. The p-values from the separate fits can't be combined directly, but summary(pool(cox_with_imp)) reports a p-value for the test that each pooled coefficient is 0. This is given by the Pr(>|t|) column. See page 45 of the van Buuren book for the theory on this.
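
To make this concrete, here is a minimal sketch on simulated data (the `sim` data frame and its columns are hypothetical, not the original TestTable). It assumes a reasonably recent mice (3.x or later), where summary(pool(...)) returns a data frame with columns term, estimate, std.error, df and p.value; older versions return a matrix with differently named columns such as est and Pr(>|t|), so the names would need adjusting there.

```r
library(mice)
library(survival)

# Hypothetical simulated data, not the original TestTable
set.seed(42)
n   <- 200
snp <- factor(sample(c("WT", "MUT"), n, replace = TRUE), levels = c("WT", "MUT"))
sim <- data.frame(
  Survival = rexp(n, rate = ifelse(snp == "MUT", 0.2, 0.1)),
  Event    = rbinom(n, 1, 0.8),
  Age      = rnorm(n, 60, 10),
  SNP1     = snp
)
sim$SNP1[sample(n, 30)] <- NA   # introduce missing values

imp_sim <- mice(sim, m = 5, maxit = 5, seed = 12345, printFlag = FALSE)
fit     <- with(imp_sim, coxph(Surv(Survival, Event) ~ Age + SNP1))

# Pool the five fits with Rubin's rules (estimates are on the log HR scale)
pooled <- summary(pool(fit))

# Exponentiate the estimate and a t-based 95% interval to get hazard ratios
tcrit <- qt(0.975, pooled$df)
hr_table <- data.frame(
  term    = pooled$term,
  HR      = exp(pooled$estimate),
  lower   = exp(pooled$estimate - tcrit * pooled$std.error),
  upper   = exp(pooled$estimate + tcrit * pooled$std.error),
  p.value = pooled$p.value
)
hr_table
```

The same last step should apply to cox_with_imp from the question: exponentiate the pooled estimates and keep the pooled p-value column.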