I am looking for a more concise way to compute a cumulative set difference between grouped vectors in a tibble (the elements of A that are not in B), where each difference is taken between the vector at group "x" and the concatenation of the vectors in all previous groups.

I have found a solution using `anti_join` within a for loop, but I wonder if there is a more concise way.

library(magrittr)  # for the %<>% assignment pipe
library(tidyverse)

tibble(stage = rep(1:4, each = 2),
       string = c("a", "b", "a", "c", "b", "d", "f", "e")) %>%
  (function(df) {
    # Start with the first stage, which has no previous groups
    df_filtered <- df %>%
      filter(stage == 1)

    # For each later stage, keep only strings absent from all earlier stages
    for (stage_sub in unique(df$stage)[-1]) {
      df_filtered %<>%
        bind_rows(
          df %>%
            filter(stage == stage_sub) %>%
            anti_join(df %>%
                        filter(stage < stage_sub), by = "string")
        )
    }

    df_filtered
  })

In other words, if:

group1: "a", "b"

group2: "a", "c"

group3: "b", "d"

when I compute the cumulative consecutive difference between group3 and group1:2, I should obtain:

group3: "d"

because "d" is the only element not included in all previous groups.
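
The expected result for these three groups can be checked with a small base R sketch. The list name `groups` and the variable `previous` are made up for this illustration only:

```r
# Hypothetical list holding the three example groups described above
groups <- list(group1 = c("a", "b"),
               group2 = c("a", "c"),
               group3 = c("b", "d"))

# Everything seen in the groups that come before group3
previous <- union(groups$group1, groups$group2)

# Elements of group3 that appear in none of the previous groups
new_in_group3 <- setdiff(groups$group3, previous)
new_in_group3
# "d"
```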

  • Is it cheating to just look at the strings and identify the group where each one first appears? `your_tibble_example %>% group_by(string) %>% filter(stage == min(stage))` – Jon Spring Jun 21 '19 at 00:31
  • Isn't this just removing duplicate values? `df[!duplicated(df$string), ]` – Ronak Shah Jun 21 '19 at 00:32
  • Thanks both! Ronak, I think your solution works, but I find it difficult to implement when there are more grouping variables or when the grouping variables are not sorted. For example, using `table <- tibble(set = rep(LETTERS[1:2], each=4), stage = c(1:4, 3, 4, 1, 2), string = c("a", "b", "a", "c", "b", "f", "f", "e"))`, I have found another solution: `table %>% arrange(set, stage) %>% group_by(set, string) %>% distinct(string, .keep_all = TRUE)`. The problem is that the table first needs to be arranged by the grouping variables. Jon's solution doesn't need that. Thanks very much! – Francesco Cabiddu Jun 21 '19 at 02:46
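
For reference, the first-appearance idea from the comments might be sketched as below for the two-set example. The names `df_sets` and `first_seen` are made up for this illustration, and the sketch assumes that a string's lowest `stage` within a `set` marks its first appearance:

```r
library(dplyr)

# Hypothetical name for the two-set example table from the comment above
df_sets <- tibble(set = rep(LETTERS[1:2], each = 4),
                  stage = c(1:4, 3, 4, 1, 2),
                  string = c("a", "b", "a", "c", "b", "f", "f", "e"))

# Within each set, keep each string only at the earliest stage where it occurs
first_seen <- df_sets %>%
  group_by(set, string) %>%
  filter(stage == min(stage)) %>%
  ungroup()

first_seen
```

Because the filter compares each row against the minimum `stage` of its own group, the rows do not need to be arranged beforehand. If the same string could occur twice within a single stage, both rows would be kept, so a final `distinct()` may be needed in that case.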
