3

I am trying to use R to fit a linear model and make predictions. My model includes some constant side parameters that are not in the data frame. Here's a simplified version of what I'm doing:

dat <- data.frame(x=1:5,y=3*(1:5))
b <- 1
mdl <- lm(y~I(b*x),data=dat)

Unfortunately the model object now suffers from a dangerous scoping issue: lm() does not save b as part of mdl, so when predict() is called, it has to reach back into the environment where b was defined. Thus, if subsequent code changes the value of b, the predict value will change too:

y1 <- predict(mdl,newdata=data.frame(x=3)) # y1 == 9
b <- 5
y2 <- predict(mdl,newdata=data.frame(x=3)) # y2 == 45

How can I force predict() to use the original b value instead of the changed one? Alternatively, is there some way to control where predict() looks for the variable, so I can ensure it gets the desired value? In practice I cannot include b as part of the newdata data frame, because in my application, b is a vector of parameters that does not have the same size as the data frame of new observations.

Please note that I have greatly simplified this relative to my actual use case, so I need a robust general solution and not just ad-hoc hacking.

Paul
  • 3,321
  • 1
  • 33
  • 42

2 Answers2

3

eval(substitute the value into the quoted expression

mdl <- eval(substitute(lm(y~I(b*x),data=dat), list(b=b)))

mdl
# Call:
# lm(formula = y ~ I(1 * x), data = dat)
# ...
Rorschach
  • 31,301
  • 5
  • 78
  • 129
  • Wow, that's neat. Only downside is it makes `str(mdl)` more ugly because it's full of the substituted values. But I can live with that for now, unless you can think of an even better approach? Thanks! – Paul Aug 16 '15 at 23:39
  • @Paul I'm not sure what you mean? Because there are numbers in place of "b"? or something else? – Rorschach Aug 16 '15 at 23:42
  • yes, in practice I'm doing `cut(x,breaks=v)` and your solution is expanding `v` into `c(0,0.8,1.7,...)` wherever `v` is found in the `mdl` object. This is only an aesthetic issue though; at the end of the day, your solution does the job exactly as required. – Paul Aug 16 '15 at 23:44
  • 1
    @Paul ok, try `mdl <- local({b=b; lm(y ~ I(b*x), data=dat) })`, maybe that is prettier – Rorschach Aug 16 '15 at 23:46
  • Yes, that fixes the aesthetic issue. Could you possibly explain how is R still finding the original `b` value even after I change it in the global workspace? – Paul Aug 16 '15 at 23:49
  • 1
    @Paul you probably just want to read the `?local` documentation (the environment stuff can be confusing and it will do a better job than me), but `local` creates a new enviroment where you can define variables for the expression. For example, if you instead did `mdl <- local({ b=b; lm(y ~ I(b*x), data=dat) }, envir=.GlobalEnv)`, it wouldn't work when you redefine `b` in the global environment. – Rorschach Aug 16 '15 at 23:55
  • 1
    Thanks. After reading the docs I was a bit puzzled about where the new environment spawned by `local()` lived. But I found this code illuminating: `my_env=new.env(); mdl <- local({b<-1; lm(y ~ I(b*x), data=dat) },envir=my_env); ls(my_env)`, which outputs `"b"` – Paul Aug 17 '15 at 00:03
  • @Paul yes, that is a perfect example :) – Rorschach Aug 17 '15 at 00:05
0

We could also use bquote

mdl <- eval(bquote(lm(y~I(.(b)*x), data=dat)))
mdl

#Call:
#lm(formula = y ~ I(1 * x), data = dat)

#Coefficients:
#(Intercept)     I(1 * x)  
#  9.533e-15    3.000e+00  

According to ?bquote description

‘bquote’ quotes its argument except that terms wrapped in ‘.()’ are evaluated in the specified ‘where’ environment.

akrun
  • 874,273
  • 37
  • 540
  • 662