I have a variable with the same name as a column in a dataframe:

df <- data.frame(a=c(1,2,3), b=c(4,5,6))
b <- 5

I want to get the rows where df$b == b, but dplyr interprets this as df$b == df$b:

df %>% filter(b == b) # interpreted as df$b == df$b
#   a b
# 1 1 4
# 2 2 5
# 3 3 6

If I change the variable name, it works:

B <- 5
df %>% filter(b == B) # interpreted as df$b == B
#   a b
# 1 2 5

I'm wondering if there is a better way to tell filter that b refers to an outside variable.

nachocab
  • this might help you file:///Library/Frameworks/R.framework/Versions/3.2/Resources/library/dplyr/doc/nse.html – MLavoie Dec 11 '15 at 09:21
  • @MLavoie what is this? Better to provide [this link](https://cran.r-project.org/web/packages/dplyr/vignettes/nse.html). –  Dec 11 '15 at 09:30
  • @Pascal. There was a similar question a few days ago, but I don't remember where it is. It looks like the environment is important here, and this link explains how dplyr's verbs can be used in a similar context. But I might not have understood the question, so if that's the case, disregard my comment :) – MLavoie Dec 11 '15 at 09:36
  • @MLavoie You misunderstand my comment. You provided a path to a local file, which only works for OSX users, not for Linux and Windows users. I simply provided the Internet version to the same file. –  Dec 11 '15 at 09:38
  • Typing `vignette("nse")` in the console is another option – talat Dec 11 '15 at 09:40
  • This was the recent question here on [filter and nse](https://stackoverflow.com/questions/46713002/). – David Klotz Oct 16 '17 at 13:10

4 Answers

Recently I have found this to be an elegant solution to this problem, although I'm just starting to wrap my head around how it works.

df %>% filter(b == !!b)

which is syntactic sugar for the following (UQ() has since been deprecated in rlang in favor of !!):

df %>% filter(b == UQ(b))

At a high level, the UQ (unquote) operation causes its contents to be evaluated before the filter operation runs, so b is looked up in the calling environment rather than within the data.frame.

This is described in this chapter of Advanced R, on 'quasi-quotation'. This chapter also includes a few solutions to similar problems related to non-standard evaluation (NSE).
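
As a minimal sketch of the same idea inside a function, assuming a current dplyr and rlang: the helper name filter_b_equals below is hypothetical (it is not part of dplyr), and its argument deliberately shares a name with the column.

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Hypothetical helper, for illustration only: the argument `b` clashes
# with the column `b` on purpose.
filter_b_equals <- function(data, b) {
  # `!!b` is replaced by the value of the argument before filter()
  # evaluates the expression, so the left-hand `b` is the column and
  # the right-hand side is a plain constant.
  data %>% filter(b == !!b)
}

res_unquote <- filter_b_equals(df, 5)
res_unquote
```

Without the !!, the comparison inside the helper would be b == b on the column alone, which keeps every row.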

jackinovik
  • I am confused by your caveat: I get now same result with `b == !!b` as `!!b == b`, while the latter is different from `!!(b == b)`. Maybe different versions ( I have dplyr 1 and rlang 0.4.7)? Can you confirm? Thanks! – Matifou Aug 13 '20 at 17:06
  • This is a good point, thank you. This behavior changed in later versions of rlang. I will edit my answer accordingly. – jackinovik Sep 17 '20 at 20:29

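You could use the get function to fetch the value of the variable from the environment. Depending on the dplyr version, get("b") may find the column first; in that case, naming the environment explicitly, e.g. get("b", envir = globalenv()) when b was defined at the top level as in the question, avoids the ambiguity.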

df %>% filter(b == get("b")) # Note the "" around b
nist

As a general solution, you can use the SE (standard evaluation) version of filter, which is filter_ (deprecated since dplyr 0.7.0 in favor of tidy evaluation). In this case, things get a bit confusing because you are mixing a column and an 'external' constant in a single expression. Here is how you do that with the interp function:

library(lazyeval)
df %>% filter_(interp(~ b == x, x = b))

If you would like to substitute several values at once, you can pass them as a list through .values:

df %>% filter_(interp(~ b == x, .values = list(x = b)))
Axeman

rlang, which is imported with dplyr, has the .env and .data pronouns for exactly this situation, where data-masking forces you to be explicit. To explicitly reference columns in your data frame, use .data; to explicitly reference your environment, use .env:

library(dplyr)
df %>% 
  filter(.data$b == .env$b) # b == .env$b works the same here

  a b
1 2 5

From the documentation:

Note that .data is only a pronoun, it is not a real data frame. This means that you can't take its names or map a function over the contents of .data. Similarly, .env is not an actual R environment.

You do not necessarily need to use .data$b here because the evaluation searches the data frame for a column with that name first (as you found out).
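
The pronouns are also useful inside a function whose argument clashes with a column name. The following is a rough sketch under that assumption; keep_rows_matching is a hypothetical helper name, not a dplyr function.

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Hypothetical helper, for illustration only: the argument `b` has the
# same name as the column `b`.
keep_rows_matching <- function(data, b) {
  # .data$b is always the column; .env$b is always the variable from the
  # environment where the expression was written (here, the argument).
  data %>% filter(.data$b == .env$b)
}

res_pronoun <- keep_rows_matching(df, 5)
res_pronoun
```

Here .env$b picks up the function argument rather than anything in the global environment, because .env refers to the environment in which the filter expression was written.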

LMc