
I've encountered a weird behavior when calling lm within an lapply with the weights argument.

My code consists of a list of formulas, and I fit a linear model to each one through lapply. So far this was working:

dd <- data.frame(y = rnorm(100),
                 x1 = rnorm(100),
                 x2 = rnorm(100),
                 x3 = rnorm(100),
                 x4 = rnorm(100),
                 wg = runif(100,1,100))

ls.form <- list(
  formula(y~x1+x2),
  formula(y~x3+x4),
  formula(y~x1|x2|x3),
  formula(y~x1+x2+x3+x4)
)

res.no.wg <- lapply(ls.form, lm, data = dd)

However, when I add the weights argument, I get a weird error:

res.with.wg <- lapply(ls.form, lm, data = dd, weights = dd[,"wg"])
Error in eval(extras, data, env) : 
  ..2 used in an incorrect context, no ... to look in

It's as if the ... from lapply were conflicting with the ... of the lm call, but only because of the weights argument.

Any idea what is causing this problem and how to fix it?

NOTE: using the call without the lapply works as expected:

lm(ls.form[[1]], data = dd, weights = dd[,"wg"] )

Call:
lm(formula = ls.form[[1]], data = dd, weights = dd[, "wg"])

Coefficients:
(Intercept)           x1           x2  
   -0.12020      0.06049     -0.01937  

EDIT: The final call is an lapply within a function of the type:

f1 <- function(samp, dat, wgt){
res.with.wg2 <- lapply(ls.form, function(x) {lm(formula = x, data=dat[samp,], weights=dat[samp,wgt])})
}

f1(1:66, dat=dd, wgt = "wg")
Bastien
  • This seems to be an issue with using `lm` in functions with `weights`, see: https://stackoverflow.com/questions/38683076/ellipsis-trouble-passing-to-lm – John Paul Dec 20 '17 at 14:09
  • I have reopened this even though the question has been asked before since the answer here is better than any of the answers to the original question. https://stackoverflow.com/questions/33479862/use-a-weights-argument-in-a-list-of-lm-lapply-calls – G. Grothendieck Dec 20 '17 at 15:22
  • @G.Grothendieck isn't it more valuable to have both questions linked together via duping? Maybe dupe the other one with this? – Sotos Dec 20 '17 at 15:25
  • OK. I have closed the other answer and linked it to this one. – G. Grothendieck Dec 20 '17 at 15:39
  • One other item. The checked answer is best at explaining this, but the other answer, which unfortunately was deleted by its author, seems to me to be the preferred workaround, and it would be nice if its author reversed the deletion. – G. Grothendieck Dec 20 '17 at 15:43
  • @G. Grothendieck I have undeleted my answer, which I initially deleted because I felt it was close to other answers that are already on SO and James' solution was better. Sorry if I caused any confusion! – Florian Dec 20 '17 at 16:01

2 Answers


I am not really sure why it is not working, but I do think I have a fix for you:

res.with.wg2 <- lapply(ls.form, 
                   function(x) {lm(formula = x, data=dd, weights=dd[,"wg"])})
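
One way to check that the wrapper really picks up the weights is to compare its fits with models fitted one at a time. The following is only a minimal sketch on made-up data: `toy`, `forms`, `wrapped`, `reference` and the fixed seed are assumptions for illustration, not part of the original question.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
toy <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30),
                  wg = runif(30, 1, 100))
forms <- list(y ~ x1, y ~ x1 + x2)

# Wrapper approach: lm() is called by name inside an anonymous function.
wrapped <- lapply(forms,
                  function(f) lm(formula = f, data = toy, weights = toy[, "wg"]))

# Reference fits made one at a time, outside lapply.
reference <- list(lm(y ~ x1, data = toy, weights = toy[, "wg"]),
                  lm(y ~ x1 + x2, data = toy, weights = toy[, "wg"]))

# The coefficients should agree pairwise if the weights were used.
agree <- mapply(function(a, b) isTRUE(all.equal(coef(a), coef(b))),
                wrapped, reference)
print(agree)
```

If `agree` is all `TRUE`, the wrapped calls behave like the stand-alone weighted fits.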

Hope this helps!

Florian
  • You shouldn't have removed your comment on my question, the link you provided was useful – Bastien Dec 20 '17 at 14:20
  • Hi @Bastien, sorry for removing the comment. I removed it because I was not sure if that was the actual problem (there was a long comment thread below that answer), especially when I read @James' answer below. I am still trying to understand the problem, but I am afraid it goes a bit above my head. What really confuses me is the fact that `res.with.wg <- lapply(X=ls.form, FUN=lm, data = dd, method= 'model.frame')` does work properly. – Florian Dec 20 '17 at 14:22
  • For the curious, it is [this issue](https://stackoverflow.com/questions/33479862/use-a-weights-argument-in-a-list-of-lm-lapply-calls). It contains an answer almost exactly the same as mine... – Florian Dec 20 '17 at 14:33
  • Sadly, this solution doesn't work (yet) for me... as my problem is one level deeper... My lapply is in a function... making this example more accurate: `f1 <- function(samp, dat, wgt){ res.with.wg2 <- lapply(ls.form, function(x) {lm(formula = x, data=dat[samp,], weights=dat[samp,wgt])}) }`, then calling it with `f1(1:66, dat=dd, wgt = "wg")`. Still looking for a fix, but it seems like a deep problem! Should I modify my question or start a new one? – Bastien Dec 20 '17 at 14:38
  • @Bastien, for your problem you might be better off creating a base model, and then using `update` to create new ones with different formulae. – James Dec 20 '17 at 14:45
  • When I think of it, my problem is a `parLapply-function-lapply-function-lm` call... Maybe overly complicated but I was sure R could handle it. I'll look at the `update` function, it may indeed simplify things... – Bastien Dec 20 '17 at 14:51
  • @Bastien I've updated my answer with an example of how to use it – James Dec 20 '17 at 14:51

There is a note in the help file for lapply:

For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g., bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[i]], ...), with i replaced by the current (integer or double) index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required to ensure that method dispatch for is.numeric occurs correctly.

lm uses match.call twice in its opening lines:

cl <- match.call()
mf <- match.call(expand.dots = FALSE)

The solution noted in the help file and by @Florian is to use an anonymous function wrapper.
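
To see what that note means in practice, here is a small sketch that records the call in the same way lm does. `show_call` and `toy` are hypothetical names introduced only for this illustration. lm later re-evaluates pieces of the recorded call, so what matters is whether the `weights` entry is still an expression that can be evaluated elsewhere.

```r
# Hypothetical helper: returns the call it was invoked with, as lm() records it.
show_call <- function(formula, data, ...) match.call()

toy <- data.frame(y = c(1, 2, 4, 3), x = c(1, 2, 3, 4), w = c(1, 2, 1, 2))

# Called directly, the recorded call holds the expressions the user typed.
direct <- show_call(y ~ x, data = toy, weights = toy$w)

# Called through lapply, the recorded call refers to lapply's internals
# rather than to toy$w.
via_lapply <- lapply(list(y ~ x), show_call,
                     data = toy, weights = toy$w)[[1]]

# Called through a wrapper, the recorded call again holds ordinary expressions.
via_wrapper <- lapply(list(y ~ x),
                      function(f) show_call(f, data = toy, weights = toy$w))[[1]]

print(direct)
print(via_lapply)
print(via_wrapper)
```

Printing the three calls side by side should show that only the lapply version has lost the original `toy$w` expression, which is consistent with the error message complaining about `..2`.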

Update

For this specific problem of changing the model formula, you can rewrite to avoid calling lm within the lapply by using update instead:

# create base model (the formula here doesn't really matter, but we can specify the weights safely here)
baselm <- lm(y~x1,data=dd,weights=dd[,"wg"])
# update with lapply
lapply(ls.form,update,object=baselm)
[[1]]

Call:
lm(formula = y ~ x1 + x2, data = dd, weights = dd[, "wg"])

Coefficients:
(Intercept)           x1           x2  
    0.07561      0.16111      0.15014  

...
James
  • Interesting, I'll have to rearrange all my code, but I'll try it. I'll let you know how it turns out. – Bastien Dec 20 '17 at 14:54
  • In the end, your solution is good for this question, but it doesn't fix my problem (I think it just moved the problem somewhere else...). So I accepted your answer and asked a follow-up question here: https://stackoverflow.com/questions/47909470/calling-update-within-a-lapply-within-a-function-why-isnt-it-working – Bastien Dec 20 '17 at 15:31
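
For the function-within-lapply case raised in the comments, a likely cause is that lm evaluates `weights` first in `data` and then in the environment of the formula. Formulas created at top level do not see variables that are local to a function, such as `dat` or `samp`. One possible workaround, offered only as a sketch, is to copy the weights into the subsetted data frame so they are found in `data`. `fit_subset`, the `forms` argument and the column name `.w` are hypothetical, not taken from the thread.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
toy <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30),
                  wg = runif(30, 1, 100))
forms <- list(y ~ x1, y ~ x1 + x2)

# Hypothetical variant of f1: the formulas are passed in explicitly and the
# weights are stored as a column (.w) of the subset handed to lm().
fit_subset <- function(samp, dat, wgt, forms) {
  sub <- dat[samp, , drop = FALSE]
  sub$.w <- sub[[wgt]]  # weights now live inside `data`
  lapply(forms, function(f) lm(formula = f, data = sub, weights = .w))
}

res <- fit_subset(1:20, dat = toy, wgt = "wg", forms = forms)

# Compare with a stand-alone weighted fit on the same rows.
check <- lm(y ~ x1, data = toy[1:20, ], weights = wg)
print(all.equal(coef(res[[1]]), coef(check)))
```

Because `.w` is looked up in `data`, the fit no longer depends on where the formula was created.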