
I am trying to create a function that lets me pass outcome and predictor variable names as strings into the lm() regression function. I have asked this before here, but I have since learned a new technique here and would like to apply the same idea in this new format.

Here is the process:

library(tidyverse)

# toy data
df <- tibble(f1 = factor(rep(letters[1:3],5)),
             c1 = rnorm(15),
             out1 = rnorm(15))

# pass the relevant inputs into new objects like in a function 
d <- df
outcome <- "out1"
predictors <- c("f1", "c1")

# now create the model formula to be entered into the model
form <- as.formula(
    paste(outcome,
          paste(predictors, collapse = " + "),
          sep = " ~ "))


# now pass the formula into the model
model <- eval(bquote( lm(.(form), 
                         data = d) ))

model

# Call:
#   lm(formula = out1 ~ f1 + c1, data = d)
# 
# Coefficients:
#   (Intercept)          f1b          f1c           c1  
#       0.16304     -0.01790     -0.32620     -0.07239 
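
As an aside, the formula-building step above can also be sketched with base R's reformulate(), which assembles a formula from character vectors without any paste() calls. The name form2 is only illustrative, and outcome and predictors are repeated here so the snippet runs on its own:

```r
# Same inputs as in the example above
outcome <- "out1"
predictors <- c("f1", "c1")

# reformulate() (stats package, loaded by default) joins the term labels
# with "+" and puts the response on the left-hand side
form2 <- reformulate(predictors, response = outcome)

form2
# out1 ~ f1 + c1
```

The result is an ordinary formula object, so it should be usable wherever form is used above.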

So this all works nicely and gives an adaptable way of passing variables into lm(). But what if we want to apply special contrast coding to the factor variable? I tried:

model <- eval(bquote( lm(.(form), 
                         data = d,
                         contrasts = list(predictors[1] = contr.treatment(3)) %>% setNames(predictors[1])) ))

But I got this error:

Error: unexpected '=' in:
"                                                 data = d,
                                                 contrasts = list(predictors[1] ="

Any help much appreciated.

llewmills
  • So what should the final call look like? Maybe you want `eval(bquote( lm(.(form),data = d, contrasts = list(contr.treatment(3)) %>% setNames(predictors[1])) ))`? You are already using `setNames`, so you don't need to try to name it in the list itself. – MrFlick Aug 07 '20 at 03:29
  • First, I don't think you need `eval` + `bquote` etc. Why not `model <- lm(form, d)`? It gives the same result. – Ronak Shah Aug 07 '20 at 03:30
  • @Ronak Shah I think the motivation for the `eval` and `bquote` is so the original formula elements are printed out in the output instead of just `form`. – llewmills Aug 07 '20 at 05:42
  • Thanks @MrFlick, that got me there. I noticed you didn't supply the left-hand side of the `contrasts` argument specifying which predictor to apply the contrasts to. If there were more than one factor and they had different numbers of levels, I would have to do that. Is there a way to pass the name of the factor to which the contrast is being applied, similar to how we did it in the formula section? – llewmills Aug 07 '20 at 06:04
  • @llewmills What do you think `setNames` does? Maybe try `contr <- list(contr.treatment(3)) %>% setNames(predictors[1]); model <- eval(bquote( lm(.(form), data = d, contrasts = contr) ))` and inspect `contr`. – Roland Aug 07 '20 at 07:42
  • Also, please keep your examples minimal. There is no need for that huge meta package here (I simply refuse to install it). If you must use a tibble and pipes, `library(tibble); library(magrittr)` is sufficient. – Roland Aug 07 '20 at 07:45

1 Answer


Reducing this to the command generating the error:

list(predictors[1] = contr.treatment(3))

Results in:

Error: unexpected '=' in "list(predictors[1] ="

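This is a syntax error raised by R's parser before list() is ever called: an argument name on the left of = must be a literal name or a string, so an expression such as predictors[1] cannot appear there and is never evaluated.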
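
To illustrate the point about names inside list(), here is a small base R sketch. The objects nm and res are hypothetical names used only for this demonstration:

```r
predictors <- c("f1", "c1")

# A name written to the left of `=` is taken literally, never evaluated:
nm <- "f1"
names(list(nm = 1))
# "nm"  (not "f1")

# An expression to the left of `=` is rejected while the code is parsed,
# so list() never runs; parse() lets us capture that error as a value
res <- tryCatch(parse(text = "list(predictors[1] = 1)"),
                error = function(e) "parse error")
res
# "parse error"
```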

Your approach of using setNames() works once the name is dropped from the list() call. Build the unnamed list first, then let setNames() attach the evaluated name:

setNames(list(contr.treatment(3)), predictors[1])

Output is a named list containing a contrast matrix:

$f1
  2 3
1 0 0
2 1 0
3 0 1
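
The follow-up question in the comments (several factors with different numbers of levels) could be handled the same way. The following is a minimal sketch rather than a definitive solution: the second factor f2, the vector factors, and the choice of contr.sum() for f2 are assumptions made for illustration, and a plain data.frame stands in for the tibble so that only base R is needed.

```r
set.seed(1)

# Toy data with two factors that have different numbers of levels
d <- data.frame(
  f1   = factor(rep(letters[1:3], 4)),
  f2   = factor(rep(c("x", "y"), 6)),
  c1   = rnorm(12),
  out1 = rnorm(12)
)

outcome    <- "out1"
predictors <- c("f1", "f2", "c1")
factors    <- c("f1", "f2")   # the predictors that get custom contrasts

# Formula built from strings, as in the question
form <- as.formula(paste(outcome, paste(predictors, collapse = " + "),
                         sep = " ~ "))

# One contrast matrix per factor, sized from the data, then named via
# setNames() so each matrix is matched to its factor
contr <- setNames(
  list(contr.treatment(nlevels(d$f1)), contr.sum(nlevels(d$f2))),
  factors
)

model <- eval(bquote(lm(.(form), data = d, contrasts = contr)))

# The fitted model records which factors received custom contrasts
names(model$contrasts)
```

Because the names come from the character vector factors, the same code should work unchanged if the factor names are supplied as function arguments.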
GobiJerboa
  • Brilliant @GobiJerboa. These posts really are like sending a bunch of messages in bottles: you never know when an answer will come in. – llewmills Sep 25 '22 at 00:09