Why is dplyr::one_of() called that? All the other select_helpers names make sense to me, so I'm wondering if there's an aspect of one_of() that I don't understand.

My understanding of one_of() is that it just lets you select variables using a character vector of their names instead of putting their names into the select() call, but then you get all of the variables whose names are in the vector, not just one of them. Is that wrong, and if it's correct, where does the name one_of() come from?

user438383
MissMonicaE
  • I think the only people who can answer that are the developers of `dplyr`. Try e-mailing `maintainer("dplyr")`. – Rui Barradas Aug 24 '17 at 15:48
  • +1. Great question. Was looking for `one_of` to pass a character vector to functions in the `recipes` package only to ignore/overlook it because the name suggests that it returns only one... might have been perhaps more intuitive to call `one_of` something like `from_names`... – vathymut Jul 12 '18 at 17:09
  • Seems like it would be better named `is_one_of()`, to match the predicate-style naming of `starts_with()`, `contains()`, `matches()`, etc. – Ken Williams Jan 15 '20 at 19:17

3 Answers

one_of allows for guessing or subset-matching

Let's say I know that, in general, my column names will come from c("mpg","cyl","garbage"), but I don't know which of those columns will actually be present because of interactivity/reactivity. Then

mtcars %>% select(one_of(c("mpg","cyl","garbage")))

evaluates, but issues a warning:

Warning message:
Unknown variables: `garbage`

In contrast

mtcars %>% select(mpg, cyl, garbage)

does not evaluate and gives the error

Error in overscope_eval_next(overscope, expr) : 
  object 'garbage' not found    
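
To make the difference concrete, here is a minimal sketch that runs both calls and records what happens. It assumes dplyr is installed; `wanted`, `res` and `err` are just names I made up for the illustration, and the exact wording of the warning and error differs between dplyr versions.

```r
library(dplyr)

# Hypothetical vector of candidate column names; "garbage" is not in mtcars
wanted <- c("mpg", "cyl", "garbage")

# one_of() keeps the columns that exist and only warns about the rest
res <- withCallingHandlers(
  select(mtcars, one_of(wanted)),
  warning = function(w) {
    message("warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
names(res)

# Bare column names fail as soon as one of them is missing
err <- tryCatch(
  select(mtcars, mpg, cyl, garbage),
  error = function(e) e
)
inherits(err, "error")
```

So `res` still contains mpg and cyl, while the second call never returns a data frame at all.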
CPak

The way I think about it is that select() eventually evaluates to a logical vector. So if you use starts_with it goes through the variables in the dataframe and asks whether the variable name starts with the right set of characters. one_of does the same thing but asks whether the variable name is one of the names listed in the character vector. But as they say, naming things is hard!
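
A rough base R sketch of that mental model, using `%in%` directly. This is only an illustration of the idea, not how dplyr implements the helper internally; `wanted`, `keep` and `picked` are made-up names.

```r
# Hypothetical vector of candidate column names; "garbage" is not in mtcars
wanted <- c("mpg", "cyl", "garbage")

# For each column of mtcars, ask: is its name "one of" the wanted names?
keep <- names(mtcars) %in% wanted
keep

# Use the resulting logical vector to subset the columns
picked <- mtcars[, keep, drop = FALSE]
names(picked)
```

Read this way, the question is asked once per column ("is this name one of these?"), which is why you can get several columns back.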

Shorpy
  • I think you nailed it. The `tidyverse` packages often make use of synonyms of names of base R functions, `one_of` should be thought as a synonym of `%in%` and we check if each colname is "one of" the given options, then everything makes sense. Still not a fan of the name though :). – moodymudskipper Aug 30 '18 at 20:01

The reason for its name seems to be that it allows you to look for, at least, one of the variables that are contained in the vector.

For example:

select(flights, dep, arr_delay, sched_dep_time) won't work because the variable "dep" does not exist. It will throw an error instead of returning a result.

select(flights, one_of(c("dep", "arr_delay", "sched_dep_time"))) will work, even though the variable "dep" does not exist (you just get a warning about it). In this case, "arr_delay" and "sched_dep_time" will be shown.

The helper should be read as: at least one_of() the variables will be shown :)
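
As a side note, newer releases (tidyselect 1.0.0 and later, as far as I know) mark one_of() as superseded in favour of any_of() and all_of(), whose names say more directly what happens with missing variables. A small sketch, assuming a recent dplyr is installed; `wanted`, `lenient` and `strict` are made-up names.

```r
library(dplyr)

# Hypothetical vector of candidate column names; "garbage" is not in mtcars
wanted <- c("mpg", "cyl", "garbage")

# any_of(): keep whichever of the names exist, silently skip the others
lenient <- select(mtcars, any_of(wanted))
names(lenient)

# all_of(): every name must exist, otherwise the call errors
strict <- tryCatch(
  select(mtcars, all_of(wanted)),
  error = function(e) e
)
inherits(strict, "error")
```

any_of() is the closer match to the "at least one of" reading above, just without the warning.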

Paul OHF