
I’m trying to write a function that passes some quosure arguments to an interior dplyr::select, but I want to be able to apply conditions to those arguments after they have been supplied. In this particular case, because selecting columns that do not exist produces an error, I want the function to check whether the columns named by the caller exist in the dataframe passed via the tib argument, and to remove any that do not, before I pass the quosure and the unquoting operator to select.

The problem is that once something is inside a quosure, I don’t know how to manipulate it any more. I could convert the names to strings, eliminate the extra names, and then convert the vector of strings back to symbols with syms. I think that would work in this case, because I am separately handing the interior select the dataframe in which the symbols should be evaluated. But it strips off all the benefits of using a quosure and then artificially supplies them again, which seems roundabout and inelegant. I'd like to avoid a kludgy solution that works in this precise case but doesn't offer any useful principles for next time.
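For concreteness, here is a minimal sketch of the string round trip described above. It assumes the columns are supplied as bare names inside c(). The helper name drop_existing is hypothetical, and all.vars() with rlang::quo_get_expr() is just one possible way to pull the names out of the quosure:

```r
library(dplyr)
library(rlang)

# Hypothetical helper: drop only those requested columns that exist in tib.
drop_existing <- function(tib, remove_cols) {
  remove_cols <- enquo(remove_cols)
  # Names of all symbols in the captured expression, e.g. "C" "D" "E"
  requested <- all.vars(quo_get_expr(remove_cols))
  # Keep only the names that are actual columns of tib
  present <- intersect(requested, names(tib))
  # Nothing left to drop: return the input unchanged
  if (length(present) == 0) return(tib)
  # Convert the surviving names back to symbols and splice them into c()
  select(tib, -c(!!!syms(present)))
}

tst_tb <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)
names(drop_existing(tst_tb, c(C, D, E)))
```

Because all.vars() only sees symbols, this sketch would not cope with selection helpers such as starts_with("x") in remove_cols, which is part of why the approach feels specific to this case.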

library(tidyverse)
library(magrittr)  # for the tee pipe %T>%, which library(tidyverse) does not attach

tst_tb <- tibble(A = 1:10, B = 11:20, C=21:30, D = 31:40)

################################################################################
# strip() expects a non-anonymous dataframe, from which it removes the rows
# specified (as a logical vector) in remove_rows and the columns specified (as
# unquoted names) in remove_cols. It also prints a brief report; just the df
# name, length and width, and the remaining column names.
strip <- function(tib, remove_rows = FALSE, remove_cols = NULL){
  remove_rows <- enquo(remove_rows)
  remove_cols <- enquo(remove_cols)
  tib_name    <- deparse(substitute(tib))  
  out <- tib %>%
    filter(! (!! remove_rows))  %>%
    select(- !! remove_cols) %T>% (function(XX = .){
      print(paste0(tib_name,": Length = ", nrow(XX), "; Width = ", ncol(XX)))
      cat("\n")
      cat("     Names: ", names(XX))
    })
  out  
}

The next line will not work because of E in the remove_cols argument. You should think of E not as one name out of four or five, but as 10 or 20 arguments out of several hundred.

out_tb <- strip(tib = tst_tb, remove_rows = (A < 3 | D > 36),  remove_cols = c(C, D, E))

out_tb

Desired output:

# A tibble: 4 x 2
      A     B
  <int> <int>
1     3    13
2     4    14
3     5    15
4     6    16
andrewH
  • It sounds like you are already on the right path, even though it feels "kludgy". When you turn `c(C,D,E)` into a quosure, you're not storing a list of symbols, you're storing an unevaluated call to the concatenation function. You'd need to parse that call to extract the symbols. And the way R usually deals with unevaluated symbols is to refer to them by strings. – MrFlick May 06 '19 at 14:20
  • And when you have the strings you don't have to bother turning them into symbols again, because `select()` supports strings: https://stackoverflow.com/questions/49582602/how-not-to-select-columns-using-select-dplyr-when-you-have-character-vector-of/49582655 – MrFlick May 06 '19 at 14:20
  • In this scenario, what exactly are the benefits of using a quosure? – Alexis May 08 '19 at 05:52
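The suggestion in the comments (extract the names from the captured call, then hand select() strings directly) could be sketched as follows. This is an illustration under stated assumptions, not a posted answer: strip_any is a hypothetical variant of strip() without the report-printing step, and any_of() assumes dplyr 1.0.0 or later, which is newer than this question.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of strip(); the printed report is left out for brevity.
strip_any <- function(tib, remove_rows = FALSE, remove_cols = NULL) {
  remove_rows <- enquo(remove_rows)
  remove_cols <- enquo(remove_cols)
  # The quosure wraps the unevaluated call c(C, D, E); all.vars() returns the
  # symbol names inside it as a character vector.
  col_names <- all.vars(quo_get_expr(remove_cols))
  tib %>%
    filter(!(!!remove_rows)) %>%
    # any_of() (assumed available, dplyr >= 1.0.0) ignores names that are
    # not columns of tib instead of raising an error.
    select(-any_of(col_names))
}

tst_tb <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)
out_tb <- strip_any(tst_tb, remove_rows = (A < 3 | D > 36),
                    remove_cols = c(C, D, E))
out_tb
```

Here the names never need to be turned back into symbols, since the existence check is delegated to any_of(). The sketch still assumes bare column names in remove_cols; expressions such as D:E or helper calls would need different handling.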
