This was harder than I thought it would be. I think the following works, but hopefully someone can come up with something prettier.
First we make some data to work with (please do this yourself in the future), which I include if anyone else wants a go:
library(tidyverse)
df1 <- tribble(
~state, ~group, ~species,
"CA", 2, "cat, dog, chicken, mouse",
"CA", 1, "cat",
"NV", 1, "dog, chicken",
"NV", 2, "chicken",
"WA", 1, "chicken, rat, mouse, lion",
"WA", 2, "dog, cat",
"WA", 3, "dog, chicken",
"WA", 4, "cat, chicken")
df2 <- tribble(
~state, ~special_species,
"CA", "cat",
"CA", "chicken",
"CA", "mouse",
"WA", "cat",
"WA", "chicken",
"NV", "dog")
Then the solution is:
df1 %>%
separate_rows(species) %>%
full_join(df2, on = "state") %>%
filter(species == special_species) %>%
group_by(state, group) %>%
summarise(species = paste(special_species, collapse = ", ")) %>%
full_join(df1, by = c("state" = "state", "group" = "group")) %>%
select(state, group, special_species = species.x) %>%
arrange(state)
#> Joining, by = "state"
#> # A tibble: 8 x 3
#> # Groups: state [3]
#> state group special_species
#> <chr> <dbl> <chr>
#> 1 CA 1 cat
#> 2 CA 2 cat, chicken, mouse
#> 3 NV 1 dog
#> 4 NV 2 <NA>
#> 5 WA 1 chicken
#> 6 WA 2 cat
#> 7 WA 3 chicken
#> 8 WA 4 cat, chicken
The code can be significantly simplified if you would accept the desired output in slightly different format, e.g. the following is correct save the NA
:
df1 %>%
separate_rows(species) %>%
full_join(df2, on = "state") %>%
filter(species == special_species) %>%
group_by(state, group) %>%
summarise(species = paste(special_species, collapse = ", "))
#> Joining, by = "state"
#> # A tibble: 7 x 3
#> # Groups: state [3]
#> state group species
#> <chr> <dbl> <chr>
#> 1 CA 1 cat
#> 2 CA 2 cat, chicken, mouse
#> 3 NV 1 dog
#> 4 WA 1 chicken
#> 5 WA 2 cat
#> 6 WA 3 chicken
#> 7 WA 4 cat, chicken
Created on 2019-12-03 by the reprex package (v0.3.0)