3

I have a data frame with 3 columns A, B, C, and I'd like to build a function that only keeps rows where column A is lower than another column (which could be column B or C).

I know we need to use filter_ and SE (standard evaluation) to make this possible with dplyr. I had a look at the vignette, but I don't understand how it works.

How could I transform this function into an SE function?

df = data.frame(columnA = 1:100,
                columnB = rnorm(100, 50, 10),
                columnC = rnorm(100, 50, 10))

fct = function(df, column_name){
  df2 = df %>% filter(columnA < column_name)
  return(df2)
}
  • @psql, what is `NSE` function? – Marta Jan 28 '16 at 11:51
  • Non-standard evaluation https://cran.r-project.org/web/packages/dplyr/vignettes/nse.html –  Jan 28 '16 at 11:59
  • Maybe [this](http://stackoverflow.com/questions/34922586/r-make-function-robust-to-both-standard-and-non-standard-evaluation) Q/A can help you. You will probably need `filter_` – David Arenburg Jan 28 '16 at 12:01
  • Just to make sure you're clear on the nomenclature, the function `fct` is using NSE which is the `dplyr` default. The function with "_" on the end (`filter_`) is the one which uses SE. – NGaffney Jan 28 '16 at 12:13
  • @NGaffney: oh yes, I made a mistake, sorry –  Jan 28 '16 at 12:18

3 Answers

1

Transforming your expression inside filter_ into a string is one way to do it:

fct = function(df, column_name){
  df2 = df %>% filter_(paste("columnA <", column_name))
  return(df2)
}
nrow(fct(df, "columnB"))
## [1] 50
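
The underscore verbs such as filter_ have been deprecated since dplyr 0.7, so on a recent dplyr the same string-based (SE) interface can be sketched with the `.data` pronoun instead. This is only a minimal sketch: the helper name `keep_below` and the small data frame are made up for illustration and are not part of the original question.

```r
library(dplyr)

# Small deterministic data frame (hypothetical, for illustration only)
df <- data.frame(columnA = c(1, 20, 60),
                 columnB = c(50, 50, 50),
                 columnC = c(15, 15, 15))

# SE-style helper: column_name is a string such as "columnB".
# .data[[column_name]] looks the column up by name inside the data frame.
keep_below <- function(df, column_name) {
  df %>% filter(columnA < .data[[column_name]])
}

keep_below(df, "columnB")  # keeps rows where columnA < columnB
keep_below(df, "columnC")  # keeps rows where columnA < columnC
```

Compared with pasting a string together, this avoids parsing text as code, which also makes it safer with unusual column names.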
NGaffney
0

NGaffney's answer is the SE version. Here's the NSE version, meaning it allows you to enter an unquoted column name:

require(dplyr)
df = data.frame(columnA=20, columnB=50, columnC=15)

fct = function(df,column_NSE){
  column_name = deparse(substitute(column_NSE))
  df2 = df %>% filter_(paste("columnA < ", column_name))
  return(df2)
}

Test run:

> fct(df,columnB)
  columnA columnB columnC
1      20      50      15

> fct(df,columnC)
[1] columnA columnB columnC
<0 rows> (or 0-length row.names)
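
For readers on a newer dplyr (1.0 or later, with rlang 0.4 or later), the same unquoted-column interface can probably be written without deparse() and filter_ by using the embrace operator `{{ }}`. This is a sketch under that version assumption; the name `keep_below_nse` and the three-row data frame are hypothetical.

```r
library(dplyr)

# Hypothetical three-row data frame; the scalars 50 and 15 are recycled
df <- data.frame(columnA = c(1, 20, 60), columnB = 50, columnC = 15)

# NSE-style helper: `column` is passed unquoted, e.g. columnB.
# {{ column }} forwards the caller's expression into filter().
keep_below_nse <- function(df, column) {
  df %>% filter(columnA < {{ column }})
}

keep_below_nse(df, columnB)  # keeps rows where columnA < columnB
keep_below_nse(df, columnC)  # keeps rows where columnA < columnC
```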
Paul
0

Here's a function that works with character input/SE.

fct = function(df, column_name){
  #convert to sym from chr
  column_name = sym(column_name)

  #filter; !! unquotes the symbol so it is treated as a column of df
  df %>% filter(columnA < !!column_name)
}

Test:

> df %>% fct("columnB") %>% head()
  columnA  columnB  columnC
1       1 68.80929 56.49032
2       2 58.17927 68.06920
3       3 57.52833 66.00263
4       4 41.38442 57.58875
5       5 38.93989 61.93183
6       6 51.10835 54.70835

I am not sure why one has to do the sym() call first.
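
On the sym() question, a short sketch may help. filter() evaluates its arguments inside the data frame, so a plain string like "columnB" is just text, not a column. sym() turns the string into a symbol (a bare name), and `!!` splices that symbol into the expression so that filter() sees columnA < columnB. The data frame below is made up for illustration, and rlang::expr() is used only to display the expression that gets built.

```r
library(dplyr)

# Hypothetical three-row data frame; the scalars 50 and 15 are recycled
df <- data.frame(columnA = c(1, 20, 60), columnB = 50, columnC = 15)

# The string "columnB" becomes the symbol columnB
col <- rlang::sym("columnB")

# Build the expression without evaluating it, to see what filter() receives
built <- rlang::expr(columnA < !!col)
print(built)

# The same unquoting inside filter() compares the two columns row by row
result <- df %>% filter(columnA < !!col)
print(result)
```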

CoderGuy123