I would like to find a shorter way of computing a cumulative set difference between grouped vectors in a tibble (the elements of A that are not in B), where each difference is taken between the vector in group "x" and the concatenation of the vectors in all previous groups.
I have found a solution using anti_join within a for loop, but I wonder if there is a more concise way:
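library(magrittr)   # provides the %<>% assignment pipe
library(tidyverse)

tibble(stage = rep(1:4, each = 2),
       string = c("a", "b", "a", "c", "b", "d", "f", "e")) %>%
  (function(df) {
    # Start with the first stage, which has no previous groups to compare against
    df_filtered <- df %>%
      filter(stage == 1)
    # For each later stage, keep only the strings not seen in any earlier stage
    # (skipping stage 1 avoids relying on 1:0 evaluating to c(1, 0))
    for (stage_sub in unique(df$stage)[-1]) {
      df_filtered %<>%
        rbind(
          df %>%
            filter(stage == stage_sub) %>%
            anti_join(df %>% filter(stage < stage_sub), by = "string")
        )
    }
    df_filtered
  })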
In other words, if:
group1: "a", "b"
group2: "a", "c"
group3: "b", "d"
when I compute the cumulative difference between group3 and the combination of group1 and group2, I should obtain:
group3: "d"
because "d" is the only element of group3 that does not appear in any previous group.
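
To make the expected result concrete, here is a minimal base R sketch of the same check using setdiff and union on the three example groups. The names `groups` and `expected` are only illustrative and are not part of my actual code; it states the target output and is not meant as the concise tidyverse solution I am looking for.

```r
# Groups from the example above, as a named list of character vectors
# (`groups` is a hypothetical name used only for this illustration)
groups <- list(
  group1 = c("a", "b"),
  group2 = c("a", "c"),
  group3 = c("b", "d")
)

# Elements of group3 that are absent from the union of group1 and group2
setdiff(groups$group3, union(groups$group1, groups$group2))
# "d"

# The same check for every group: compare each group against the
# concatenation of all earlier groups (empty for the first group)
expected <- lapply(seq_along(groups), function(i) {
  setdiff(groups[[i]], unlist(groups[seq_len(i - 1)]))
})
names(expected) <- names(groups)
expected
# group1: "a" "b"
# group2: "c"
# group3: "d"
```

The first group is returned unchanged because there is nothing before it to subtract.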