I am using dplyr v1.0.2 to manipulate tibbles. I would like to use group_by(), using a function or a regular expression to specify the relevant variable names (the ... argument). The only solution that I've found is clunky. Is there a relatively simple way?
Here is a minimal example that demonstrates the problem:
library(dplyr)
data(iris)
iris[, -(rbinom(1, 1, .5) + 1) ] %>% # randomly drop "Sepal.Length" or "Sepal.Width"
  group_by(matches("^Sepal\\."))
In the third line, I randomly drop one of the two "Sepal" columns. In the last line, I want to group by the remaining "Sepal" column. The problem is that I don't know its name: it could be either "Sepal.Length" or "Sepal.Width". And the group_by() command in the last line doesn't work: it predictably returns a "matches() must be used within a *selecting* function" error message.
By contrast, this code works, but it is a bit clunky:
iris[, -(rbinom(1, 1, .5) + 1) ] %>%
  group_by(!!as.name(grep('Sepal', colnames(.), value = TRUE)))
Is there a simpler way to do the grouping on the second line?
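
One possibility, offered as a minimal sketch rather than a definitive answer: group_by() uses data masking rather than tidy selection, which is why matches() fails on its own. Assuming dplyr 1.0.0 or later (so v1.0.2 qualifies), across() accepts selection helpers inside data-masking verbs, so wrapping matches() in across() should group by whichever "Sepal" column remains. The names dropped and grouped below are only illustrative.

```r
library(dplyr)

# Randomly drop "Sepal.Length" (column 1) or "Sepal.Width" (column 2)
dropped <- rbinom(1, 1, .5) + 1

# across() lets a selection helper such as matches() be used inside group_by()
grouped <- iris[, -dropped] %>%
  group_by(across(matches("^Sepal\\.")))

# Show the name of the column that ended up as the grouping variable
group_vars(grouped)
```

If the pattern matched more than one column, this would group by all of them, which differs from the as.name() approach that expects exactly one match.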