0

For a project I want to do a robust regression with the 'robust' package in R. The data consists of prevalences of certain mutations on both X and Y axis so I used the binomial family. The problem is that whenever I try to calculate the confidence intervals I get an error:

Error in predict.glmRob(mod, newdata = dfPred, type = "response") :
attempt to apply non-function

This is the R-code that I ran:

mod      <- glmRob(pop2 ~ pop1, data=df, family=binomial)
xweights <- seq(0, 0.2, 0.001)
dfPred   <- data.frame(pop1 = xweights)
yweights <- predict(mod, newdata=dfPred, type="response")

And these are the data:

            pop2           pop1
1   0.0000000000    0.006656805
2   0.0023738872    0.027366864
3   0.0071216617    0.054733728
4   0.0029673591    0.030325444
5   0.0094955490    0.175295858
6   0.0000000000    0.022189349
7   0.0005934718    0.019970414
8   0.0000000000    0.011834320
9   0.0011869436    0.023668639
10  0.0053412463    0.159763314
11  0.0005934718    0.070266272
12  0.0000000000    0.014792899
13  0.0077151335    0.154585799
14  0.0005934718    0.003698225
15  0.0011869436    0.062130178
16  0.0017804154    0.025147929
17  0.0071216617    0.053254438
18  0.0136498516    0.196745562

I found someone to help me extract the confidence intervals from the model with the following code but then I get confidence intervals ranging from 0 to 1 which doesn't happen in the non-robust glm or when another family is selected.

mod          <- glmRob(pop2 ~ pop1, data=df, family=binomial)
yweights     <- fitted.values(mod)
coefficients <- coef(mod)
se           <- coef(summary(mod))[,2]
intercept    <- as.numeric(coefficients[1] + c(-2*se[1], 0, 2*se[1]))
slope        <- as.numeric(coefficients[2] + c(-2*se[2], 0, 2*se[2]))
lci          <- intercept[1]+slope[1]*df$pop1
    uci          <- intercept[3]+slope[3]*df$pop1

Does anyone have an idea how to solve this?

Tubeman
  • 71
  • 1
  • 5
  • Your approach for calculating the intervals is flawed. It doesn't even consider correlation between the parameter estimates. You could try jackknife or bootstrap approaches instead (although I'm not 100 % sure that they play well with a robust GLM). However, I can't get your code to run without errors. – Roland Mar 15 '16 at 12:08
  • Your `pop2` is just a fraction. You are using the `binomial` family, which requires the number of 'successes' & the number of 'failures'. I've never tried `glmRob`, but it may be able to work with a `weights` argument (as `glm` does) that would indicate the total number of Bernoulli trials. Do you have that information? Have you tried incorporating it? Can you paste it in here? – gung - Reinstate Monica Mar 15 '16 at 15:12
  • Yes, this is also something that I tried but it gives the same error. For every value in my table the number of 'trials' is 1685. I appended a column with the weights to the table (by `cbind(df, c(rep(1685, 18)`) and ran glmRob with that column specified as weights. The model is created successfully again but the calculation of the confidence intervals gives either an error or the range [0,1] depending on the code used. – Tubeman Mar 15 '16 at 16:39

0 Answers0