
I am trying to run an interval regression using the R survival package (as described here https://stats.oarc.ucla.edu/r/dae/interval-regression/), but I am running into difficulties when pooling results across multiply imputed datasets. Specifically, although estimates are returned, I get the following error: log(1 - 2 * pnorm(width/2)) : NaNs produced. The estimates seem reasonable at face value (no NaNs, and no very large or very small SEs).

I ran the same model on the stacked dataset (ignoring imputations) and on individual imputed datasets, and in neither case do I get the error. Would someone be able to explain what is going on? Is this an ignorable error? If not, is there a workaround that avoids it?

Thanks so much!

# A Reproducible Example

library(survival)  # survreg(), Surv()
library(mice)      # mice(), complete(), as.mids(), pool()
library(car)       # recode()

# Create DF
dat <- data.frame(dv = c(1, 1, 2, 1, 0, NA, 1, 4, NA, 0, 3, 1, 3, 0, 2, 1, 4, NA, 2, 4),
                  catvar1 = factor(c(0, 0, 0, 0, 0, 1, 0, 0, 0, NA, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0)),
                  catvar2 = factor(c(1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, NA, 0)))

dat_imp <- mice(data = dat)

# Transform Outcome Var for Interval Reg
dat_imp_long <- complete(dat_imp, action = "long", include=TRUE)

# 1-4 correspond to ranges (e.g., 1 = 1 to 2 times...4 = 10 or more)
# create variables that reflect this range
dat_imp_long$dv_low <- car::recode(dat_imp_long$dv, "0 = 0; 1 = 1; 2 = 3; 3 = 6; 4 = 10")
dat_imp_long$dv_high <- car::recode(dat_imp_long$dv, "0 = 0; 1 = 2; 2 = 5; 3 = 9; 4 = 999")
dat_imp_long$dv_high[dat_imp_long$dv_high > 40] <- Inf

# Convert back to mids
dat_mids <- as.mids(dat_imp_long)

# Run Interval Reg 
model1 <- with(dat_mids, survreg(Surv(dv_low, dv_high, type = "interval2") ~ 
                                     catvar1 + catvar2, dist = "gaussian"))

# Warning message for both calls: In log(1 - 2 * pnorm(width/2)) : NaNs produced
# The problem occurs not only with pool() but also with summary()
summary(model1)
summary(pool(model1))

# Run Equivalent Model on Individual Datasets
# No errors produced
imp1 <- subset(dat_imp_long, .imp == 1)
model2 <- survreg(Surv(dv_low, dv_high, type = "interval2") ~ 
                       catvar1 + catvar2, dist = "gaussian", data = imp1)
summary(model2)

imp2 <- subset(dat_imp_long, .imp == 2)
model3 <- survreg(Surv(dv_low, dv_high, type = "interval2") ~ 
                    catvar1 + catvar2, dist = "gaussian", data = imp2)
summary(model3)

# Equivalent Analysis on Stacked Dataset
# No error
model <- with(dat_imp_long, survreg(Surv(dv_low, dv_high, type = "interval2") ~ 
                                  catvar1 + catvar2, dist = "gaussian"))
summary(model)
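
To narrow down which call raises the warning, one option is to wrap each step in a small warning collector. The sketch below uses base R only; `collect_warnings` is a hypothetical helper name, not a function from mice or survival, and the demonstration at the end simply reuses the expression quoted in the warning text.

```r
# Hypothetical helper (not part of mice or survival): evaluate an expression,
# record every warning message it raises, and return both the value and the
# recorded messages.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,  # the promise is forced here, inside the handler's scope
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # keep going instead of printing
    }
  )
  list(value = value, warnings = msgs)
}

# Self-contained demonstration with the expression from the warning text;
# the widths 1 and 2 are made-up illustration values.
out <- collect_warnings(log(1 - 2 * pnorm(c(1, 2) / 2)))
print(out$warnings)
print(out$value)
```

With the objects from the example above, something like `collect_warnings(summary(model1))$warnings` or `lapply(model1$analyses, function(fit) collect_warnings(summary(fit))$warnings)` could then show whether the warning is tied to a particular fitted model or only appears when the `with()` result is summarised as a whole. This assumes `model1` is the `mira` object returned by `with(dat_mids, ...)`.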
Rachel
  • I only get a warning with your code, not an error. I wonder if this question might be more appropriate on [Cross Validated](https://stats.stackexchange.com/), as this doesn't seem to be a programming issue. It looks like there's an issue calculating the deviance when pooling, but I'm not familiar enough with interval regression to know why. – TrainingPizza Feb 16 '22 at 21:22
  • Sorry, yes, you're right, it's a warning not an error. I can post it on Cross Validated, if it would be more appropriate there – Rachel Feb 16 '22 at 22:11
  • I think you're more likely to receive a response there as I don't immediately see anything wrong with the programming (although as I noted I am not very familiar with interval regression) – TrainingPizza Feb 16 '22 at 22:22
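
For reference, the expression in the warning can be evaluated on its own in base R. The sketch below copies `log(1 - 2 * pnorm(width/2))` verbatim from the warning text; the `width` values are made up for illustration, and which rows survival actually evaluates this expression on is not established here.

```r
# Expression copied from the warning text. The width values are illustrative
# assumptions, not values taken from the survival internals.
width <- c(-1, 0, 0.5, 2, Inf)

# pnorm(width / 2) exceeds 0.5 whenever width > 0, so `inner` is negative there
inner <- 1 - 2 * pnorm(width / 2)

# log() of a negative number is NaN; the warning is suppressed for display only
res <- suppressWarnings(log(inner))

print(data.frame(width = width, inner = inner, log_inner = res))
```

Taken purely as arithmetic, the expression is finite only for a negative `width`, gives `-Inf` at zero, and gives `NaN` for any positive `width`, including an infinite one.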

0 Answers