I want to write an R function using dplyr to summarise a data set. The function should accept any number of grouping variables for the group_by statement, including no grouping at all. I have found answers to similar questions that use 'group_by_', but this has been deprecated (the dplyr version at the time of writing is 1.1.2).
I have tried different methods of passing vectors to the group_by statement using tidy evaluation, but none has worked as expected, and all fail to return a result when no grouping is required.
Here's the basis for a reproducible example using the starwars dataset. The function should be capable of returning summary tables of the body mass index (BMI) of the various creatures.
`library(dplyr)
star_wars_BMI <- function(group_vec) {
  df_out <- starwars %>%
    # BMI = mass (kg) / height (m)^2; starwars stores height in cm
    mutate(BMI = mass / (height / 100)^2) %>%
    # the step in question: group_vec is a character vector of column names (possibly empty)
    group_by(group_vec) %>%
    summarise(height_mean = mean(height, na.rm = TRUE),
              mass_mean = mean(mass, na.rm = TRUE),
              BMI_mean = mean(BMI, na.rm = TRUE))
  return(df_out)
}
group_vector0 <- c()                        # i.e. no grouping: summarise for the whole galaxy
group_vector1 <- c("homeworld")             # summarise by homeworld planet
group_vector2 <- c("homeworld", "species")  # summarise by species on each homeworld
galaxy_BMI <- star_wars_BMI(group_vec = group_vector0)
homeworld_BMI <- star_wars_BMI(group_vec = group_vector1)
`
I know it's a relatively simple task to produce separate functions for either no or some groups, but I would like to see if it is possible to do this with just one.
An explanation of the tidy evaluation rationale would be very much appreciated, as would an example that goes on to plot the summaries.
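For reference, here is a minimal sketch of the kind of single function I am after. It is one possible approach, not necessarily the best one. The rationale, as far as I understand it, is that group_by uses data masking, so a bare `group_vec` is treated as an expression to compute rather than as a set of column names. Wrapping it in `across(all_of(...))` switches to tidy selection, where `all_of()` marks the character vector as external column names, and an empty vector simply selects no columns. The default `character()`, the `.groups = "drop"` argument, and the ggplot2 bar chart are my own assumptions for illustration.

```r
library(dplyr)
library(ggplot2)  # assumption: ggplot2 is used for the plot

star_wars_BMI <- function(group_vec = character()) {
  starwars %>%
    # BMI = mass (kg) / height (m)^2; starwars stores height in cm
    mutate(BMI = mass / (height / 100)^2) %>%
    # all_of() treats group_vec as column names; an empty vector gives no grouping
    group_by(across(all_of(group_vec))) %>%
    summarise(
      height_mean = mean(height, na.rm = TRUE),
      mass_mean   = mean(mass, na.rm = TRUE),
      BMI_mean    = mean(BMI, na.rm = TRUE),
      .groups = "drop"  # return an ungrouped tibble in every case
    )
}

galaxy_BMI    <- star_wars_BMI(character())                 # one row for the whole galaxy
homeworld_BMI <- star_wars_BMI("homeworld")                 # one row per homeworld
species_BMI   <- star_wars_BMI(c("homeworld", "species"))   # one row per homeworld/species pair

# Bar chart of mean BMI by homeworld, dropping rows with no usable values
bmi_plot <- homeworld_BMI %>%
  filter(!is.na(homeworld), !is.na(BMI_mean)) %>%
  ggplot(aes(x = BMI_mean, y = reorder(homeworld, BMI_mean))) +
  geom_col() +
  labs(x = "Mean BMI", y = "Homeworld")
```

Printing `bmi_plot` draws the chart. I used `character()` rather than `c()` for the no-grouping case so that the argument is always a character vector.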