
Due to the size of my dataset I'm limited to speedlm, fastLm or biglm. Unfortunately I'm stuck with speedlm, as fastLm doesn't have an update function and biglm only supports a single core.

Using speedlm I want to show all residuals. I know that for lm or fastLm I can simply use the residuals() function. However, it turns out speedlm doesn't support this.

lmfit <- speedlm(formula, res)
print(names(lmfit))
[1] "coefficients" "coef"         "df.residual"  "XTX"          "Xy"           "nobs"         "nvar"         "ok"           "A"            "RSS"          "rank"         "pivot"        "sparse"       "yy"           "X1X"          "intercept"    "method"       "terms"        "call"

lmfit <- fastLm(formula, res)
print(names(lmfit))
[1] "coefficients"  "stderr"        "df.residual"   "fitted.values" "residuals"     "call"          "intercept"     "formula"

Is there a way to show all residuals using speedlm?

When I call print(residuals(lmfit)), it just prints NULL.

Edit:

When using the method suggested by @Roland, it returns only NAs:

lmfit  <- speedlm(formula , res, fitted=TRUE)
resids <- res$Daily_gain - predict(lmfit, newdata=res)
print(summary(resids))

# Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
#   NA      NA      NA     NaN      NA      NA  829780
Bas

1 Answer

library(speedglm)

Store the fitted values (needs more RAM):

fit <- speedlm(Sepal.Length ~ Species, data = iris, fitted = TRUE)
iris$Sepal.Length - predict(fit)

Or don't store them (needs more CPU time):

fit1 <- speedlm(Sepal.Length ~ Species, data = iris)
iris$Sepal.Length - predict(fit1, newdata = iris)
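
If `predict()` returns only `NA` after a rank-deficiency warning, one plausible cause is that the aliased coefficients are `NA` and propagate through the matrix product. Below is a minimal sketch of computing the residuals by hand from the estimable coefficients only. It uses base `lm()` on a made-up data frame as a stand-in, and it assumes that `coef()` on a `speedlm` fit reports aliased terms as `NA` in the same way; the names `d`, `x1`, `x2` and `y` are hypothetical.

```r
# Hypothetical data: x2 is an exact multiple of x1, so the design is rank-deficient.
set.seed(1)
d <- data.frame(x1 = rnorm(50))
d$x2 <- 2 * d$x1
d$y  <- 1 + 3 * d$x1 + rnorm(50)

fit  <- lm(y ~ x1 + x2, data = d)   # stand-in for the speedlm fit (assumption)
beta <- coef(fit)                   # the aliased term (x2) comes back as NA
X    <- model.matrix(y ~ x1 + x2, data = d)

# Use only the coefficients that could be estimated.
ok         <- !is.na(beta)
fitted_val <- drop(X[, ok, drop = FALSE] %*% beta[ok])
resids     <- d$y - fitted_val

summary(resids)
```

Note that `model.matrix()` builds the full design matrix in memory, so on a very large dataset this may need to be done in chunks.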
dickoa
Roland
  • @Roland, thanks, it prints them now. However, storing them in RAM gives me the following warning: `Warning messages: 1: In predict.speedlm(rval, data) : prediction from a rank-deficient fit may be misleading 2: In predict.speedlm(lmfit, newdata = res) : prediction from a rank-deficient fit may be misleading` Should I just ignore this? – Bas Oct 19 '15 at 11:26
  • No, you should not ignore this. – Roland Oct 19 '15 at 11:29
  • @Roland What should I do with the warnings? – Bas Oct 19 '15 at 11:40
  • Investigate why you have a rank-deficient fit, of course. – Roland Oct 19 '15 at 11:44
  • @Roland I just continued with this code and found that it returns only NA values, while other linear model functions don't return a single `NA`. Any idea why this might be? – Bas Oct 21 '15 at 06:31
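
For the rank-deficiency warning discussed in the comments, here is a minimal sketch of one way to find out which predictors are redundant. It uses only base R (`qr()`, `model.matrix()`, `alias()`) on a made-up data frame; the variable names are hypothetical, and the formula would need to be replaced with the one actually passed to `speedlm`.

```r
# Hypothetical data with a redundant predictor (x3 = x1 + x2).
set.seed(1)
d <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
d$x3 <- d$x1 + d$x2
d$y  <- rnorm(30)

X <- model.matrix(y ~ x1 + x2 + x3, data = d)

# A rank below the number of columns means some columns are redundant.
qrX <- qr(X)
c(rank = qrX$rank, columns = ncol(X))

# Columns that the pivoted QR moved past the rank are the linearly dependent ones.
dependent <- colnames(X)[qrX$pivot[-seq_len(qrX$rank)]]
dependent

# alias() shows how the dropped terms depend on the kept ones.
alias(lm(y ~ x1 + x2 + x3, data = d))
```

Dropping the columns listed in `dependent` from the formula (or merging factor levels that cause them) should remove the warning, assuming the redundancy is exact rather than a near-collinearity.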