library(dplyr)

Toy dataset:

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
df
  x y
1 1 4
2 2 5
3 3 6

This works fine:

df %>% filter(y == 5)
  x y
1 2 5

This also works fine:

z <- 5
df %>% filter(y == z)
  x y
1 2 5

But this fails (every row is returned instead of just the matching one):

y <- 5
df %>% filter(y == y)
  x y
1 1 4
2 2 5
3 3 6

Apparently, dplyr cannot make the distinction between its column y and the global variable y. Is there a way to tell dplyr that the second y is the global variable?

Marco

2 Answers

You can do:

df %>% filter(y == .GlobalEnv$y)

or:

df %>% filter(y == .GlobalEnv[["y"]])

Both of these work in this context, but won't if all this is going on inside a function. But get will:

df %>% filter(y == get("y"))
f = function(df, y){df %>% filter(y==get("y"))}

So use get.

Or just use df[df$y==y,] instead of dplyr.
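
For readers on a more recent dplyr, here is a minimal sketch of two further options that the answer above does not mention: the `.env` pronoun and the `!!` injection operator from tidy evaluation. It assumes a dplyr version that ships these (they are not part of the 2016-era release discussed here), and the function name `f` simply mirrors the one above.

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# .env$y looks y up in the calling environment, not among the columns
# (assumption: a dplyr/rlang version that provides the .env pronoun)
res_env <- df %>% filter(y == .env$y)

# !!y injects the current value of y before filter() evaluates the expression
res_bang <- df %>% filter(y == !!y)

# The same idea inside a function whose argument shadows a column name:
# .data$y is the column, .env$y is the function argument
f <- function(df, y) {
  df %>% filter(.data$y == .env$y)
}
res_fun <- f(df, 5)
```

In each case only the row with `y` equal to 5 should be kept. Spelling out `.data$y` and `.env$y` makes it explicit which `y` is meant, which is the ambiguity the question ran into.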

Spacedman

The global environment can be accessed via the .GlobalEnv object:

> filter(df, y==.GlobalEnv$y)
  x y
1 2 5

Interestingly, using the accessor function globalenv() as a substitute for .GlobalEnv doesn't work in this scenario.

Hong Ooi
  • `globalenv()` fails because any function call with a `$y` on it will fail, because `dplyr` seems to do some bad things to expressions like `foo()$y` if there's a `y` in your data frame. Horrible. – Spacedman Oct 21 '16 at 07:16
  • `globalenv()[["y"]]` on the other hand does work. It's always the `$` that gives problems. – Axeman Oct 21 '16 at 07:33
  • It's the non-standard evaluation of `dplyr` that gives problems. Recall that without `dplyr` the code is simply `df[df$y==y,]`. – Spacedman Oct 21 '16 at 07:51
  • @Spacedman NSE doesn't have anything to do with it, save indirectly. `subset(df, y==globalenv()$y)` works, for example. This is just a good old-fashioned bug in dplyr. – Hong Ooi Oct 21 '16 at 08:56
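
The base R variants mentioned in the comments can be sketched as follows, with no dplyr involved. The `assign()` call is an assumption added here so that `y` is defined in the global environment even if the snippet is sourced from inside a function; at the console a plain `y <- 5` does the same thing.

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# Put y in the global environment explicitly (equivalent to y <- 5 at top level)
assign("y", 5, envir = globalenv())

# Plain indexing: df$y is the column, the bare y is the ordinary variable
base_idx <- df[df$y == y, ]

# subset() also evaluates its condition inside df, yet both accessors
# reach the global y rather than the column
base_sub_dollar  <- subset(df, y == globalenv()$y)
base_sub_bracket <- subset(df, y == globalenv()[["y"]])
```

All three should return the single row where `y` is 5, which is the behaviour the last comment describes for `subset()`.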