
I found a similar question here, but it does not fully cover my case.

My question has two parts:

  1. I want to store a "slim" version of an R lm() object as text in a DBMS.
  2. I want to be able to produce predictions out of the text object I saved.

By "slim" I mean keeping only the data that predict() needs in order not to fail. I want to store the model because fitting sometimes takes a long time. For example:

lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE

To store the model as text, I deparse the lmSlim object:

lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")

Storing this in the DB is easy, but reusing it later fails:

lmRst <- eval(parse(text=lmTxt))
class(lmRst)
[1] "lm"
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
Error in eval(expr, envir, enclos) : object 'Volume' not found

Any suggestions?


3 Answers


I've solved the issue. It is a bit of a workaround, but it works:

# learning and reducing the size of output
lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL
pred1 <- predict(lmFull,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
pred2 <- predict(lmSlim,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred2)
[1] TRUE

# deparse and collapse into a string
lmTxt <- deparse(lmSlim)
lmTxt <- paste0(lmTxt,collapse="")

# re-parsing
lmParsed <- eval(parse(text=lmTxt))
lmParsed$call <- lmFull$call
lmParsed$terms <- lmFull$terms
lmParsed
pred3 <- predict(lmParsed,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))
identical(pred1,pred3)
[1] FALSE

But...

sum(abs(pred1 - pred3))
[1] 1.634248e-13
as.numeric(object.size(lmParsed) / object.size(lmFull))
[1] 0.3449477

So I can live with it.
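
The tiny discrepancy above is consistent with deparse() writing numbers with 15 significant digits by default, so a double does not always survive the text round trip exactly. Here is a minimal sketch of that effect on a single number. The names x, y and z are illustrative only, and the "digits17" option assumes R 3.5.0 or later.

```r
# Illustrative value only; x stands in for one model coefficient.
x <- 1 / 3

# Default deparse() keeps 15 significant digits, so the round trip is lossy.
y <- eval(parse(text = deparse(x)))
identical(x, y)          # exact, bit-for-bit comparison
isTRUE(all.equal(x, y))  # comparison up to a small numerical tolerance

# Assumption: R >= 3.5.0, where the "digits17" deparse option is available.
z <- eval(parse(text = deparse(x, control = "digits17")))
identical(x, z)
```

For comparing predictions from a restored model, all.equal() is usually a better fit than identical(), because it tolerates this kind of rounding.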


Try this:

lmTxt <- dput(lmSlim)
lmRst <- eval(lmTxt)
predict(lmRst,newdata=data.frame(Girth=c(1,2,3),Height=c(2,3,4)))

Edit: as pointed out in the comments, dput does not return a string. So here's another option:

save(lmSlim, file='data.txt', ascii=TRUE)

The contents of the file are ASCII, so it should be possible to write them to a database. To reload the object later, use load():

load('data.txt')
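
If going through a file is inconvenient, a similar round trip can probably be done in memory with serialize() and unserialize(). This is a sketch under the assumption that the database column can hold a multi-line ASCII string. It is not the approach from the answer above, and the variable names simply follow the question. The ASCII format may not reproduce doubles bit for bit, so the comparison uses all.equal().

```r
# Fit and slim the model as in the question.
lmFull <- lm(Volume ~ Girth + Height, data = trees)
lmSlim <- lmFull
lmSlim$fitted.values <- lmSlim$qr$qr <- lmSlim$residuals <- lmSlim$model <- lmSlim$effects <- NULL

# Serialize to an in-memory raw vector in ASCII format, then to one string.
lmTxt <- rawToChar(serialize(lmSlim, connection = NULL, ascii = TRUE))

# ... store lmTxt in a text column and read it back later ...

# Restore the object from the string and predict with it.
lmRst <- unserialize(charToRaw(lmTxt))
newdata <- data.frame(Girth = c(1, 2, 3), Height = c(2, 3, 4))
pred1 <- predict(lmFull, newdata = newdata)
pred2 <- predict(lmRst, newdata = newdata)
all.equal(pred1, pred2)
```

Unlike deparse(), this does not produce R source code, so the stored text is only readable by unserialize().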
jpetterson
  • Thanks, but dput() returns an object of the same class as the one it is given. I need to save it as a character string. – Aviad Klein Feb 25 '14 at 09:49
  • Right, sorry for that! I edited the answer with another option, please see above. – jpetterson Feb 25 '14 at 10:46
  • dput does yield a text representation of the lm object. The example above would be clearer with `tmpfile <- tempfile(); dput(lmSlim, tmpfile); fitted(dget(tmpfile)); coef(dget(tmpfile))`, etc. – fabians Feb 25 '14 at 10:48

Don't store it as text. Try this:

lmFull <- lm(Volume~Girth+Height,data=trees)
lmSlim <- lmFull
lmSlim$residuals <- NULL
lmSlim$effects <- NULL
lmSlim$fitted.values <- NULL
lmSlim$model <- NULL
lmSlim$qr$qr <- NULL
predict(lmSlim)
#works
predict(lmSlim, newdata=data.frame(Girth=30, Height=20))
#works

object.size(lmFull)
#22960 bytes
object.size(lmSlim)
#7920 bytes
Roland
  • Please read the question thoroughly: I **need** to store the model as text in a DB. I know how to assign objects to variables; that is not the issue. I need a way to write "some text" that, when parsed, gives a slim version of a model that I can predict with. – Aviad Klein Feb 25 '14 at 10:27
  • Well, in principle, for predictions you only need the coefficients and the model formula. So, you could write your own `predict` function. – Roland Feb 25 '14 at 10:53
  • Thanks @Roland, I agree, but writing my own prediction function (1) will not scale to other types of fitting functions, and (2) will never be as efficient as R's built-in functions. Don't you agree? – Aviad Klein Feb 26 '14 at 04:50
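
The idea from the comments, storing only the coefficients and the formula, could look roughly like the sketch below. It is a minimal illustration and not code from any of the answers. `predictFromText`, `rhsTxt` and `coefTxt` are hypothetical names, and it assumes a plain lm() fit with an intercept whose coefficients are in the same order as the model-matrix columns.

```r
lmFull <- lm(Volume ~ Girth + Height, data = trees)

# Two pieces of text to store: the formula's right-hand side and the coefficients.
rhsTxt  <- paste("~", paste(attr(terms(lmFull), "term.labels"), collapse = " + "))
coefTxt <- paste(sprintf("%.17g", coef(lmFull)), collapse = ",")

# Hypothetical helper: rebuild the design matrix and multiply by the coefficients.
predictFromText <- function(rhsTxt, coefTxt, newdata) {
  beta <- as.numeric(strsplit(coefTxt, ",", fixed = TRUE)[[1]])
  X <- model.matrix(as.formula(rhsTxt), data = newdata)
  drop(X %*% beta)
}

newdata <- data.frame(Girth = c(1, 2, 3), Height = c(2, 3, 4))
predTxt <- predictFromText(rhsTxt, coefTxt, newdata)
all.equal(unname(predTxt), unname(predict(lmFull, newdata = newdata)))
```

As the last comment notes, this does not generalize to other fitting functions, so it is mainly useful when the stored text must stay small and human-readable.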