I am working on a meta-analysis and a sensitivity analysis for missing data. I want to replace censored data with either 0 or 1 according to a predefined probability.

I have a dataset with columns x: timepoints and y: events (1 = event, 0 = censored). For the analysis I replaced some of the 0s with NAs. z is the indicator for the treatment arm. I want to replace each NA with either 1 or 0 according to a predefined probability. This is my code:

Just an example:

library(mice)

x <- c(1:10)
y <- c(1,1,1,NA,NA,NA,1,1,0,NA)
z <- rep(2,10)

data <- data.frame(x,y,z)

str(data)
md.pattern(data)

mice.impute.myfunct <-  function(y, ry, x, ...)
{event <- sample(c(0:1), size = 1, replace=T, prob=c(0.5,0.5)); return(event)}

data.imp <- mice(data, me = c("","myfunct",""), m = 1)
data.comp <- complete(data.imp)

I would expect the NAs in y to be replaced with 0 (20% of cases) and 1 (80% of cases). Instead, within a run the NAs are all replaced with 0 or all replaced with 1.

I have to admit that I am quite a beginner with R and have not had to write my own functions before.

Thank you very much for your help!

slamballais
  • I don't get the idea of using mice here, if you are just replacing value with random values, without any covariate. Mice is intended to have an iterative process of imputing data of a variable and its covariates. What do you want to achieve here ? – denis Feb 08 '19 at 10:07
  • Hi Denis, thank you for your post. This is just an example. In the original dataset, I have results from a Kaplan-Meier curve. I want to do multiple imputations with different assumptions, e.g. 20% of censored patients died, 30%, etc. Then I want to create multiple datasets with the respective assumptions and see whether the difference between treatment arms remains. – Andreas Schmitt Feb 08 '19 at 10:36
  • I get it, and for this you don't need the `mice` function. I would advise you to do it "by hand" – denis Feb 08 '19 at 13:13
  • @denis, thank you very much for your answer. It worked for me, and with your suggestion I could also fix the problem with mice, so I will see which approach works out for me. Thanks a lot! – Andreas Schmitt Feb 08 '19 at 18:49
  • Glad it helped. don't hesitate to accept the answer if it indeed answered the question – denis Feb 08 '19 at 22:11

1 Answer

Here is a possible solution that simply replaces the missing values with 0 or 1, with the probability of a 0 varying between 0.1 and 0.9:

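# i is the assumed probability that a missing value is 0 (censored)
for (i in seq(0.1, 0.9, 0.1)) {
  # copy y into a new column named y_imp<i>, e.g. y_imp0.1
  data[[paste0("y_imp",i)]] <- data$y
  # number of missing values to fill in
  N <- sum(is.na(data$y))
  # draw N independent values: 0 with probability i, 1 with probability 1 - i
  data[[paste0("y_imp",i)]][is.na(data[[paste0("y_imp",i)]])] <- sample(c(0, 1), size = N, replace = TRUE, prob = c(i, 1 - i))
}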

The line `data[[paste0("y_imp",i)]] <- data$y` creates a new column as a copy of y. Its missing values are then replaced by 0 with probability i and by 1 with probability 1 - i.
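As for why the function in the question fills every NA with the same value: it calls `sample()` with `size = 1`, so only one value is drawn and that single value is reused for all missing entries. Below is a minimal sketch of a corrected function, assuming you still want to go through `mice`. The `p_event` argument and its default of 0.8 are my own illustrative additions (taken from the 80% mentioned in the question), not part of `mice`. The function is called directly here so the example runs without loading the package.

```r
# Sketch of a corrected custom imputation function (p_event is a hypothetical
# extra argument: the assumed probability that a missing value is an event, 1).
mice.impute.myfunct <- function(y, ry, x, wy = NULL, p_event = 0.8, ...) {
  # ry marks observed entries; by default impute wherever y is not observed
  if (is.null(wy)) wy <- !ry
  # draw one independent value per entry to impute, instead of a single value
  sample(c(0, 1), size = sum(wy), replace = TRUE,
         prob = c(1 - p_event, p_event))
}

# Direct call on the example data from the question
set.seed(1)
y  <- c(1, 1, 1, NA, NA, NA, 1, 1, 0, NA)
ry <- !is.na(y)
draws <- mice.impute.myfunct(y, ry, x = NULL)

y_filled <- y
y_filled[!ry] <- draws   # one separately drawn value for each NA
```

With this definition in place, the `mice()` call from the question should pick the function up through the method name "myfunct" as before.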

denis