1

I'm trying to create a function that takes 2 variables from a dataset, maps their distinct values side by side, and then writes the output to a csv file. I'll be using dplyr's distinct function to get the unique values.

map_table <- function(df, var1, var2){
  df_distinct <- df %>% distinct(var1, var2)
  write.csv(df_distinct, 'var1.csv')
}

map_table(iris, Species, Petal.Width)

1) map_table(iris, Species, Petal.Width) doesn't produce what I want. It should produce 27 rows of data, but I'm getting 150 rows instead.

2) How can I name the csv file after the input of var1? So if var1 = 'Sepal.Length', the name of the file should be 'Sepal.Length.csv'

smci
spidermarn
  • [Non-standard evaluation (NSE)](https://stackoverflow.com/questions/tagged/non-standard-evaluation?sort=votes&pageSize=50) is one well-known hiccup when using `dplyr`. Here's one related question from back in 2014, "How can I tell select() in dplyr that the string it is seeing is a column name in a data frame"; but the solution here is cleaner, so this should probably not be closed as a duplicate. – smci Mar 09 '19 at 02:24

3 Answers

2

If you want to pass the column names without quotes, you need to use non-standard evaluation. (More here)

deparse(substitute()) will get you the name for the file output.

library(dplyr)

map_table <- function(df, var1, var2){

  file_name <- paste0(deparse(substitute(var1)), ".csv") # file name

  var1 <- enquo(var1) # non-standard eval
  var2 <- enquo(var2) # enquo() captures the expression passed, i.e. Species

  df_distinct <- df %>% 
    distinct(!!var1, !!var2) # non-standard eval, !! tells dplyr to use Species

  write.csv(df_distinct, file = file_name)

}

map_table(iris, Species, Petal.Width)
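
As a side note, newer versions of rlang and dplyr offer the embrace operator `{{ }}`, which combines the `enquo()` and `!!` steps above. The following is only a minimal sketch under that assumption; the function name `map_table_embrace` and the `dir` argument are hypothetical additions for illustration, and the file is written to a temporary directory so the working directory is left untouched.

```r
library(dplyr)

# Hypothetical variant of map_table(); `dir` is an added argument, not in the original.
map_table_embrace <- function(df, var1, var2, dir = tempdir()) {
  # as_label(enquo(var1)) turns the unquoted column into a string, e.g. "Species"
  file_name <- file.path(dir, paste0(rlang::as_label(enquo(var1)), ".csv"))

  # {{ }} captures and unquotes in one step (same effect as enquo() followed by !!)
  df_distinct <- df %>% distinct({{ var1 }}, {{ var2 }})

  # row.names = FALSE is a choice made here to keep the row index out of the file
  write.csv(df_distinct, file = file_name, row.names = FALSE)
  invisible(df_distinct)
}

res <- map_table_embrace(iris, Species, Petal.Width)
nrow(res)
```

Returning the data frame invisibly makes it easy to check the row count against the 27 rows expected in the question.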
RLave
0

You're trying to pass the columns as objects. Try passing their names instead and then use a select helper:

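library(dplyr)

map_table <- function(df, var1, var2){
  # all_of() selects the columns named in a character vector (one_of() is superseded)
  df_distinct <- df %>% select(all_of(c(var1, var2))) %>%
      distinct()
  write.csv(df_distinct, 'var1.csv')
}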

map_table(iris, 'Species', 'Petal.Width')
Rohit
0

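1) The answer is to use distinct_ instead of distinct, and the variables need to be passed as quoted strings. 2) Build the file name by concatenating strings with paste0, and pass it through the file = argument.

map_table <- function(df, var1, var2){
  # distinct_() takes column names as strings (deprecated in newer dplyr versions)
  df_distinct <- df %>% distinct_(var1, var2)
  # paste0() avoids the space that paste() would insert before '.csv'
  write.csv(df_distinct, file = paste0(var1, '.csv'))
}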

map_table(iris, 'Species', 'Petal.Width')
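
Since the column names are plain strings in this version, the same idea can also be sketched in base R without any dplyr verbs. This is only an illustrative alternative, not the method from the answer above; the name `map_table_base` and the `dir` argument are hypothetical, and the file goes to a temporary directory by default.

```r
# Hypothetical base-R variant; `dir` is an added argument, not in the original.
map_table_base <- function(df, var1, var2, dir = tempdir()) {
  # unique() on a two-column data frame keeps one row per distinct pair
  df_distinct <- unique(df[, c(var1, var2), drop = FALSE])

  # paste0() builds "<var1>.csv" with no separator
  out_path <- file.path(dir, paste0(var1, ".csv"))
  write.csv(df_distinct, file = out_path, row.names = FALSE)
  invisible(out_path)
}

out_path <- map_table_base(iris, "Species", "Petal.Width")
nrow(read.csv(out_path))
```

Reading the file back is a quick way to confirm that the row count matches the 27 distinct pairs mentioned in the question.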
spidermarn