I am running a function on a long database (full_database) with two major groups where I need to perform various linear models on multiple subsets, per group.

Then, I extract the R^2, the adjusted R^2 and the p.value into a dataframe where each row corresponds to a single comparison. Since there are 30 different cases, I have another tibble which lists all possibilities (possibilities) where the arguments for the function lie.

The script for the original function is:

database_correlation <-  function(id, group) {

    require(dplyr)
    require(tidyr)
    require(rlang)

    id_name <- quo_name(id)
    id_var <- enquo(id)
    group_name <- quo_name(group)
    group_var <- enquo(group)

    corr_db <- full_database %>%
      filter(numid==!!id_name) %>%
      filter(major_group==!!group_name) %>%
      droplevels()

    correlation <- summary(lm(yvar~xvar, corr_db))

    id.x <- as.character(!!id_var) #Gives out an error: "invalid argument type"
    group.x <- as.character(!!group_var) #Gives out an error: "invalid argument type"
    r_squared <- correlation$r.squared
    r_squared_adj <- correlation$adj.r.squared
    p_value <- correlation$coefficients[2,4]

    data.frame(id.x, group.x, r_squared, r_squared_adj, p_value, stringsAsFactors=FALSE)
  }

I then run the function with:

correlation_all <- lapply(seq(nrow(possibilities)), function(index) {
    current <- possibilities[index,]
    with(current, database_correlation(id, database))
  }) %>%
    bind_rows()

I have commented the part where I get an error (id.x and group.x assignment) and I've tried multiple alternatives (I will use id.x as an example):

  1. id_var <- enquo(id) & id.x <- print(!!id_var)
  2. id_var <- sym(id) & id.x <- as.character(!!id_var)
  3. id_var <- sym(id) & id.x <- print(!!id_var)
  4. No id_var & id.x <- !!id_name
  5. No id_var & id.x <- id_name

The last option (5) works even though it has no unquotation. The same is true if I remove the bang bang (!!) when filtering the full_database and use filter(numid==id_name) directly, but I can't understand why. From testing with TRUE and FALSE, R seems to interpret bang bang as a double negation and, since it then expects a boolean, it throws an error.

Thank you for your help!

filcfig
  • First of all, the `enquo()` calls should all be at the start of the function. Touching an argument in any way (with `quo_name()` in your case) prevents it from being defused properly. Secondly, `!!` can only be used in a quoted context. `as.character()` doesn't quote, so you shouldn't use it there. Finally, maybe you don't need tidy eval? It looks like you're working with simple values. Try removing all the enquo / !! stuff. – Lionel Henry Nov 21 '19 at 12:39
  • Thanks for the feedback! I didn't know that `enquo()` has to run before `quo_name()`, but it's good to know! I did try to run it without any tidy eval, but the R^2 values were the same between different groups, so it was not recognizing the group argument. However, following your and @smingerson's comments I revised everything, and the mistake was that, in my original script, both the argument and the column had the same name, so I had `group==!!group`. I've changed it to have different column names in the _full_database_ and in the _possibilities_ and it works now! – filcfig Nov 21 '19 at 17:05

1 Answer

Use id and group directly. I'm presuming these are character strings that were passed in, so I don't think there's a need to coerce a quosure to a string. Additionally, !! can only be used inside functions that support tidy evaluation. A simple first check is "is the function from a base R package?". as.character() is, so !! doesn't work there.
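As a minimal sketch of the function without any tidy eval, assuming id and group really are plain character strings: the toy full_database below is made up purely so the example runs, and only its column names (numid, major_group, xvar, yvar) come from the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's full_database
set.seed(1)
full_database <- data.frame(
  numid = rep(c("a", "b"), each = 10),
  major_group = rep(rep(c("g1", "g2"), each = 5), times = 2),
  xvar = rnorm(20),
  stringsAsFactors = FALSE
)
full_database$yvar <- 2 * full_database$xvar + rnorm(20)

database_correlation <- function(id, group) {
  # id and group are ordinary strings, so no enquo() or !! is needed.
  # Neither name is a column of full_database, so filter() falls back
  # to the function's arguments.
  corr_db <- full_database %>%
    filter(numid == id, major_group == group)

  correlation <- summary(lm(yvar ~ xvar, data = corr_db))

  data.frame(
    id.x = id,
    group.x = group,
    r_squared = correlation$r.squared,
    r_squared_adj = correlation$adj.r.squared,
    p_value = correlation$coefficients[2, 4],
    stringsAsFactors = FALSE
  )
}

res <- database_correlation("a", "g1")
```

This only behaves as intended while the argument names differ from the column names; if an argument shares its name with a column, the column wins inside filter().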

If you are determined to convert the quosure to a string, you can use rlang::as_name() to retrieve the corresponding symbol as a string. This is the recommended way of doing so.
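A small sketch of that conversion, assuming the argument is passed as a bare symbol; name_of() is a hypothetical helper used only for illustration.

```r
library(rlang)

# Hypothetical helper: returns the name of the symbol it was called with
name_of <- function(arg) {
  arg_quo <- enquo(arg)  # defuse first, before touching arg in any other way
  as_name(arg_quo)       # unwrap the quosure and convert its symbol to a string
}

col_name <- name_of(numid)
```

Note that as_name() is meant for symbols; if the caller passes a more complex expression it is expected to fail rather than guess a label.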

> By testing with TRUE and FALSE, R might be interpreting bang bang as double negation and, since it's expecting a boolean, it throws out an error.

Your supposition is correct.
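A quick base R check of this, using a plain string rather than the quosure from the question (an assumption made to keep the example free of rlang):

```r
# Outside a function that supports tidy evaluation, !!x is just !(!x)
double_true  <- !!TRUE   # negated twice, so TRUE again
double_false <- !!FALSE  # negated twice, so FALSE again

# A string cannot be negated, so the inner `!` fails before
# as.character() is ever called
result <- tryCatch(
  as.character(!!"a"),
  error = function(e) conditionMessage(e)
)
```

In an English locale the captured message for the string case is "invalid argument type", matching the error reported in the question.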

> The last option (in bold), works even though it has no unquotation and the same is true if I remove the bang bang (!!) when filtering the full_database, by using filter(numid==id_name)

Tidy evaluation, at its heart, is about evaluating symbols in the correct environment, or at least that's my take. This filter() works because it looks for the symbol id_name, does not find it in the data (the first place it looks), then looks in the enclosing environment, finds it there, and evaluates the statement.

Imagine if you had a column named id_name within the data. How would you differentiate between the data's id_name and the one in the enclosing environment? Well, if you wanted the data's value, you could use .data$id_name (another rlang construct). If you want the value outside the data instead, use !!. This tells functions that support tidy evaluation to look at the quosure. The quosure identifies which environment it was defined in. Then it evaluates that symbol in that environment, ensuring no collision with a name in the data.
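The collision can be sketched with a made-up data frame (df and its values are hypothetical) in which a column and an outside variable are both called id_name:

```r
library(dplyr)

# Hypothetical data: the column id_name collides with the variable below
df <- data.frame(
  id_name = c("a", "b", "c"),
  numid   = c("b", "b", "b"),
  stringsAsFactors = FALSE
)
id_name <- "b"

# Bare id_name: the data is searched first, so this compares two columns
masked <- df %>% filter(numid == id_name)

# .data$id_name: explicitly the column, same result as above
explicit <- df %>% filter(numid == .data$id_name)

# !!id_name: the outside value "b" is inlined before filter() evaluates
unquoted <- df %>% filter(numid == !!id_name)

row_counts <- c(nrow(masked), nrow(explicit), nrow(unquoted))
```

Here the first two filters keep only the row where both columns match, while the unquoted version keeps every row with numid equal to "b". A comparison such as group == group inside filter() fails in the same silent way: both sides resolve to the column, so every row passes.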

smingerson
  • Thanks for the thorough reply! Your last paragraph showed exactly why running it with `id` and `group` wasn't yielding the correct results. I actually had `group` as a column name in both databases: the one with the data and the one with the arguments for the function. Your reply also clarified a lot on tidy eval, so thanks again! – filcfig Nov 21 '19 at 17:11