Using the "mpg" data as an example, I wrote some code that calls the lsmeans function directly (not wrapped in a function), and the output looks correct to me (shown below).

I tried to wrap the same code in a function that generates the same output, but the columns of the data are not recognized inside the function I wrote.

The error reported from my function is:

Error in eval(predvars, data, env) : object 'cty' not found

My code outside a function, which works fine:

library(rlang)
library(tidyverse)
library(dplyr)
library(multcompView)
library(lsmeans)


model = lm(cty ~ drv + class + drv:class,data=mpg)
anova(model)

marginal = lsmeans(model,~drv:class)

Pletters = multcomp::cld(marginal,
          alpha=0.05,
          Letters=letters,
          adjust="tukey")
Pletters$.group=gsub(" ", "", Pletters$.group)
Pletters

The function I wrote, which does not work:

library(rlang)
library(tidyverse)
library(dplyr)
library(multcompView)
library(lsmeans)

P_letters<-function(data, y, groupby, subgroupby){

model = lm(y ~ groupby + subgroupby + groupby:subgroupby,data=data)
anova(model)

marginal = lsmeans(model,~groupby:subgroupby)

Pletters = multcomp::cld(marginal,
          alpha=0.05,
          Letters=letters,
          adjust="tukey")
Pletters$.group=gsub(" ", "", Pletters$.group)
Pletters
}

Calling the function with the "mpg" data:

result<-mpg %>%  
P_letters(y=cty, groupby=drv, subgroupby=class)
result

Output from the code outside a function:

Analysis of Variance Table

Response: cty
           Df  Sum Sq Mean Sq  F value Pr(>F)    
drv         2 1878.81  939.41 136.6198 <2e-16 ***
class       6  804.78  134.13  19.5069 <2e-16 ***
drv:class   3   10.26    3.42   0.4974 0.6844    
Residuals 222 1526.49    6.88                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 drv class      lsmean    SE  df lower.CL upper.CL .group
 r   suv          12.0 0.791 222     9.58     14.4 a     
 4   pickup       13.0 0.456 222    11.60     14.4 a     
 4   suv          13.8 0.367 222    12.70     14.9 a     
 r   2seater      15.4 1.173 222    11.80     19.0 ab    
 f   minivan      15.8 0.791 222    13.39     18.2 ab    
 r   subcompact   15.9 0.874 222    13.21     18.6 ab    
 4   midsize      16.0 1.514 222    11.36     20.6 abc   
 4   compact      18.0 0.757 222    15.68     20.3 bc    
 f   midsize      19.0 0.425 222    17.67     20.3 bc    
 4   subcompact   19.5 1.311 222    15.48     23.5 bcd   
 f   compact      20.9 0.443 222    19.50     22.2 cd    
 f   subcompact   22.4 0.559 222    20.65     24.1 d     
 4   2seater    nonEst    NA  NA       NA       NA       
 f   2seater    nonEst    NA  NA       NA       NA       
 r   compact    nonEst    NA  NA       NA       NA       
 r   midsize    nonEst    NA  NA       NA       NA       
 4   minivan    nonEst    NA  NA       NA       NA       
 r   minivan    nonEst    NA  NA       NA       NA       
 f   pickup     nonEst    NA  NA       NA       NA       
 r   pickup     nonEst    NA  NA       NA       NA       
 f   suv        nonEst    NA  NA       NA       NA       

Confidence level used: 0.95 
Conf-level adjustment: sidak method for 21 estimates 
P value adjustment: tukey method for comparing a family of 21 estimates 
significance level used: alpha = 0.05 

Error from the function version:

Error in eval(predvars, data, env) : object 'cty' not found
  • You need to construct the formula from a string. Try `as.formula("y ~ groupby + subgroupby + groupby:subgroupby")`; the same goes for the interaction in `lsmeans`. – Roman Luštrik Aug 28 '19 at 07:18
  • Hi Roman, do you mean changing the code like this? If so, it still results in the same error. Maybe I did not understand you correctly; I would appreciate further clarification. `model = lm(as.formula("y ~ groupby + subgroupby + groupby:subgroupby"),data=data) anova(model) marginal = lsmeans(as.formula("model,~groupby:subgroupby"))` – Meadowstrain Aug 29 '19 at 06:29
  • This does not work either: `model = lm(as.formula("y ~ groupby + subgroupby + groupby:subgroupby"),data=data) anova(model) marginal = lsmeans(model,as.formula("~groupby:subgroupby"))` – Meadowstrain Aug 29 '19 at 06:34
  • Lapsus, you need to paste together the formula, e.g. using `paste` or as I did, `sprintf`: `var1 <- "groupby"; var2 <- "subgroupby"; dep <- "y"; as.formula(sprintf("%s ~ %s + %s + %s:%s", dep, var1, var2, var1, var2))` – Roman Luštrik Aug 30 '19 at 18:02
  • `P_letters<-function(mydata, y, groupby, subgroupby){ var1 <- "groupby" var2 <- "subgroupby" dep <- "y" model = lm(as.formula(sprintf("%s ~ %s + %s + %s:%s", dep, var1, var2, var1, var2)),data=mydata) anova(model) marginal = lsmeans (as.formula(sprintf("%s ~ %s:%s", model, var1, var2))) Pletters = multcomp::cld(marginal, alpha=0.05, Letters=letters, adjust="tukey") Pletters$.group=gsub(" ", "", Pletters$.group) Pletters }` – Meadowstrain Sep 03 '19 at 05:53
  • Hi Roman, above is how I modified my code according to your suggestion, to the best of my knowledge. I am new to R and must have misunderstood or done something wrong, as this modified code still does not work. Please help me. Thanks! – Meadowstrain Sep 03 '19 at 05:56
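
The string-pasting suggestion in the comments above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper name `fit_interaction` is hypothetical, the column names are passed as quoted strings, and the built-in `mtcars` data stands in for `mpg` so the example needs no extra packages. The key point is that the values of the arguments are pasted into the formula, not the literal words "y" or "groupby".

```r
# Hypothetical helper: column names arrive as strings and are pasted into a formula.
fit_interaction <- function(data, y, groupby, subgroupby) {
  frml <- as.formula(sprintf("%s ~ %s + %s + %s:%s",
                             y, groupby, subgroupby, groupby, subgroupby))
  lm(frml, data = data)
}

# mtcars is used here only as a stand-in data set; cyl and am become factors.
cars <- transform(mtcars, cyl = factor(cyl), am = factor(am))
model <- fit_interaction(cars, y = "mpg", groupby = "cyl", subgroupby = "am")
print(anova(model))
```

With this design the caller writes `y = "mpg"` rather than `y = mpg`, which avoids non-standard evaluation entirely at the cost of quoting every column name.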

1 Answer

Some R functions operate directly on the expressions that a user writes, instead of evaluating them. It's a fairly advanced concept known as non-standard evaluation (NSE), and you can learn more about it in the recent tidy evaluation guide. As a very brief example of NSE, consider the library() function:

library("tidyverse")              # Works

a <- "tidyverse"
library( a )
# Error in library(a) : there is no package called ‘a’

If library() used standard evaluation rules, the expression a would have been evaluated to produce the value "tidyverse", which would then be passed to library(). However, because of NSE, the function works directly with a, instead of evaluating it.
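
As a small base-R sketch of that contrast: `library()` has a `character.only` argument that switches it to standard evaluation, and `substitute()` shows what an NSE function actually receives. The variable `pkg` and the helper `show_expr` are illustrative names, and `stats` is chosen only because it ships with R.

```r
# `pkg` holds the package name as a string.
pkg <- "stats"

# With character.only = TRUE, library() evaluates `pkg` and loads "stats"
# instead of looking for a package literally called "pkg".
library(pkg, character.only = TRUE)

# substitute() returns the expression the caller typed, not its value.
show_expr <- function(x) deparse(substitute(x))
captured <- show_expr(pkg)
print(captured)
```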

The functions you are trying to use fall under the same umbrella and require special treatment to handle user-supplied expressions. The rlang package provides several mechanisms for this. In particular, we will use 1) ensym() to capture the variable name supplied to the function; 2) expr() to compose an unevaluated expression; 3) the !! operator to unquote the captured names, i.e. to insert them into that expression in place of the literal argument names.

P_letters <- function(data, y, groupby, subgroupby) {
  ## Compose a formula expression using user-supplied variable names
  frml <- expr( !!ensym(y) ~ !!ensym(groupby) + !!ensym(subgroupby) +
                    !!ensym(groupby):!!ensym(subgroupby) )

  ## Pass the formula expression to lm()
  model <- lm(frml,data=data)
  print(anova(model))

  ## Compose and evaluate an expression for the lsmeans() call
  lsm_expr <- expr( lsmeans(model,~!!ensym(groupby):!!ensym(subgroupby)) )
  marginal <- eval(lsm_expr)

  Pletters <- multcomp::cld(marginal, alpha=0.05,
                          Letters=letters, adjust="tukey")
  Pletters$.group=gsub(" ", "", Pletters$.group)
  Pletters
}

## This now produces the desired output
mpg %>% P_letters( y=cty, groupby=drv, subgroupby=class ) 
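
To see the three rlang pieces in isolation, here is a reduced sketch with a single predictor. The helper name `build_formula` is hypothetical, and the built-in `mtcars` data is used only so the example runs without ggplot2.

```r
library(rlang)

# Hypothetical helper: builds an unevaluated two-variable formula expression.
build_formula <- function(y, x) {
  y_sym <- ensym(y)          # the symbol the caller typed, e.g. mpg
  x_sym <- ensym(x)          # likewise for the predictor
  expr(!!y_sym ~ !!x_sym)    # unquote both symbols into the expression
}

frml <- build_formula(mpg, wt)
print(frml)

# Evaluating the expression yields an ordinary formula that lm() accepts.
fit <- lm(eval(frml), data = mtcars)
print(coef(fit))
```

Without `!!`, `expr(y_sym ~ x_sym)` would keep the literal names `y_sym` and `x_sym`, which is the same failure mode as the original `object 'cty' not found` error.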
Artem Sokolov
  • Thank you so much Artem! It is greatly helpful! I will read the rlang manual; it seems it will be very helpful to me. – Meadowstrain Sep 13 '19 at 05:52
  • Hi Artem, I am trying to pass some data in "Pletters" to another function, but I realized that when I call "Pletters" it cannot be found; the error is "Error: object 'Pletters' not found". – Meadowstrain Sep 16 '19 at 18:41
  • @Meadowstrain: `Pletters` is a local variable; it is not visible outside the function where it is defined. However, because it is returned by the function, you can simply assign the output of the function to another variable: `myvar <- mpg %>% P_letters( y=cty, groupby=drv, subgroupby=class)`. – Artem Sokolov Sep 16 '19 at 22:48
  • Got it! Thank you so much Artem! – Meadowstrain Sep 18 '19 at 05:11
  • You're welcome, @Meadowstrain. Consider accepting this answer (by clicking the green checkmark), so the question would no longer appear in the "Unanswered" category. – Artem Sokolov Sep 20 '19 at 18:00
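
The scoping point from the comments above can be illustrated with a minimal base-R sketch. The names `make_table` and `local_result` are made up for the illustration and stand in for `P_letters` and `Pletters`.

```r
# Hypothetical stand-in for P_letters(): builds a table and returns it.
make_table <- function() {
  local_result <- data.frame(group = c("a", "b"))  # exists only during the call
  local_result                                      # returned to the caller
}

found_outside <- exists("local_result")  # the local name is not visible out here
myvar <- make_table()                    # capture the returned value instead
print(found_outside)
print(nrow(myvar))
```

Once the result is stored in `myvar`, it can be passed to any other function like a regular data frame.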