I'm writing functions to automate a workflow for analyzing a lot of demographic data. I can get what I need from a regular pipeline of dplyr and tidyr functions, but I need to abstract this into NSE functions. I'm supplying a column name to a series of gather calls via a ... argument, but this only works with a single column; I need the option of using multiple columns. I'm having trouble working out how to use quos(...) in this case.
There's more to the function, but I'm including just enough to show the error.
Sample of data:
library(tidyverse)
race_pops <- structure(list(
  town = c("Hamden", "Hamden", "Hamden", "Hamden", "New Haven", "New Haven", "New Haven", "New Haven", "West Haven", "West Haven", "West Haven", "West Haven"),
  race = c("Total", "White", "Black", "Latino", "Total", "White", "Black", "Latino", "Total", "White", "Black", "Latino"),
  est = c(61476, 37043, 13209, 6450, 130405, 40164, 42970, 37231, 54972, 28864, 10677, 10977),
  moe = c(31, 1039, 998, 879, 60, 1395, 1383, 1688, 42, 1226, 1119, 1032),
  region = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("Inner Ring", "New Haven"), class = "factor")),
  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L))
Here's a working bit that yields my desired output:
race_pops %>%
  gather(key = measure, value = value, est, moe) %>%
  unite("grp2", race, measure, sep = "_") %>%
  spread(key = grp2, value = value) %>%
  gather(key = grp2, value = value, -town, -region, -starts_with("Total")) %>%
  head(10)
#> # A tibble: 10 x 6
#> town region Total_est Total_moe grp2 value
#> <chr> <fct> <dbl> <dbl> <chr> <dbl>
#> 1 Hamden Inner Ring 61476 31 Black_est 13209
#> 2 New Haven New Haven 130405 60 Black_est 42970
#> 3 West Haven Inner Ring 54972 42 Black_est 10677
#> 4 Hamden Inner Ring 61476 31 Black_moe 998
#> 5 New Haven New Haven 130405 60 Black_moe 1383
#> 6 West Haven Inner Ring 54972 42 Black_moe 1119
#> 7 Hamden Inner Ring 61476 31 Latino_est 6450
#> 8 New Haven New Haven 130405 60 Latino_est 37231
#> 9 West Haven Inner Ring 54972 42 Latino_est 10977
#> 10 Hamden Inner Ring 61476 31 Latino_moe 879
This is the function up to the point where I get the error:
gather_grp <- function(df, grp = group, value = est, moe = moe, ...) {
  name_vars <- quos(...)
  grp_var <- enquo(grp)
  value_var <- enquo(value)
  moe_var <- enquo(moe)
  df %>%
    gather(key = measure, value = value, -(!!!name_vars), -(!!grp_var)) %>%
    unite("grp2", !!grp_var, measure, sep = "_") %>%
    spread(key = grp2, value = value) %>%
    gather(key = grp2, value = value, -(!!!name_vars), -starts_with("Total"))
}
The function works if I drop region and use just the single column town:
race_pops %>%
  select(-region) %>%
  gather_grp(grp = race, value = est, moe = moe, town) %>%
  head(10)
#> # A tibble: 10 x 5
#> town Total_est Total_moe grp2 value
#> <chr> <dbl> <dbl> <chr> <dbl>
#> 1 Hamden 61476 31 Black_est 13209
#> 2 New Haven 130405 60 Black_est 42970
#> 3 West Haven 54972 42 Black_est 10677
#> 4 Hamden 61476 31 Black_moe 998
#> 5 New Haven 130405 60 Black_moe 1383
#> 6 West Haven 54972 42 Black_moe 1119
#> 7 Hamden 61476 31 Latino_est 6450
#> 8 New Haven 130405 60 Latino_est 37231
#> 9 West Haven 54972 42 Latino_est 10977
#> 10 Hamden 61476 31 Latino_moe 879
But I can't supply both town and region to the ...:
race_pops %>%
  gather_grp(grp = race, value = est, moe = moe, town, region)
#> Error in (~town): 2 arguments passed to '(' which requires 1
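The error message suggests that !!! is splicing both columns into the parentheses, and ( only takes a single argument. One possible workaround, sketched here rather than offered as a definitive fix, is to splice into c() instead, since c() accepts any number of arguments. The names gather_grp2 and pops are hypothetical and exist only for this illustration: pops is a four-row subset of the sample data rebuilt with tibble(), and the unused value and moe arguments are left out to keep the sketch short.

```r
library(dplyr)
library(tidyr)

# Hypothetical four-row subset of race_pops, for illustration only
pops <- tibble(
  town   = c("Hamden", "Hamden", "New Haven", "New Haven"),
  race   = c("Total", "White", "Total", "White"),
  est    = c(61476, 37043, 130405, 40164),
  moe    = c(31, 1039, 60, 1395),
  region = c("Inner Ring", "Inner Ring", "New Haven", "New Haven")
)

# Hypothetical variant of gather_grp(): the spliced columns go into c(),
# which accepts one or many arguments, instead of into ( )
gather_grp2 <- function(df, grp, ...) {
  name_vars <- quos(...)
  grp_var <- enquo(grp)
  df %>%
    gather(key = measure, value = value, -c(!!!name_vars), -(!!grp_var)) %>%
    unite("grp2", !!grp_var, measure, sep = "_") %>%
    spread(key = grp2, value = value) %>%
    gather(key = grp2, value = value, -c(!!!name_vars), -starts_with("Total"))
}

# Both town and region are passed through ...
out <- gather_grp2(pops, grp = race, town, region)
```

With this variant, a single column (town alone) should go through the same code path, because c(town) is as valid as c(town, region).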
Created on 2018-05-08 by the reprex package (v0.2.0).
Thanks in advance!