
Given the following tibble:

library(tibble)

dat <- tibble(sample = 1:6,
              string = c("ABC","ABC","CBA","FED","DEF","DEF"),
              x = c("a","a","b","e","d","d"),
              y = c("b","b","a","d","e","e"))
dat

# A tibble: 6 × 4
  sample string x     y    
   <int> <chr>  <chr> <chr>
1      1 ABC    a     b    
2      2 ABC    a     b    
3      3 CBA    b     a    
4      4 FED    e     d    
5      5 DEF    d     e    
6      6 DEF    d     e  

I would like to group rows by the unordered combination of columns x and y. Then, in every row where x and y are inverted relative to the first row of its group, I want to swap x and y and reverse string. The desired output:

# A tibble: 6 × 5
  sample string x     y     group
   <int> <chr>  <chr> <chr> <dbl>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
acvill

1 Answer

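library(dplyr)

# Sort the characters of each string, so "CBA" and "ABC" share the key "ABC"
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse="")

# dat is the tibble from the question; rleid() numbers consecutive runs of equal keys
dat %>% 
  group_by(group = data.table::rleid(strSort(string))) %>% 
  mutate(across(string:y, first))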

# A tibble: 6 x 5
# Groups:   group [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
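
To make the grouping step above easier to follow, here is a small base-R sketch of what strSort() returns and how run ids over those keys behave. It assumes rows of the same group are adjacent, as in the question's data; the names keys, run_id, scattered and key_id are only illustrative, and rle() is used here as a stand-in for data.table::rleid().

```r
# Same helper as above: sort the characters inside each string
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

keys <- strSort(c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"))
# keys is now c("ABC", "ABC", "ABC", "DEF", "DEF", "DEF")

# Run ids over consecutive equal keys (base-R stand-in for rleid)
runs <- rle(keys)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Caveat: run ids only merge ADJACENT rows. If equal keys are scattered,
# matching against the unique keys gives one id per key instead.
scattered <- strSort(c("ABC", "FED", "CBA"))
scattered_runs <- rle(scattered)
scattered_run_id <- rep(seq_along(scattered_runs$lengths), scattered_runs$lengths)
key_id <- match(scattered, unique(scattered))
```

With scattered input, scattered_run_id gives "ABC" and "CBA" different ids, while key_id puts them in the same group, so the match() form may be the safer choice when the data are not pre-sorted.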

Previous answer

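Here's a way using both tidyverse and apply approaches. First, sort x and y within each row. Then group_by(x, y), number the groups with cur_group_id(), and reverse string with stri_reverse() wherever its first letter doesn't match the (sorted) x. Note that this normalises every group to alphabetical order rather than to the order of its first row.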

library(tidyverse)
library(stringi)

# Sort x and y within each row (overwrites the original x and y)
dat[, c("x", "y")] <- t(apply(dat[, c("x", "y")], 1, sort))

dat %>% 
  group_by(x, y) %>% 
  mutate(group = cur_group_id(),
         string = ifelse(str_sub(string, 1, 1) == toupper(x), string, stri_reverse(string)))

# A tibble: 6 x 5
# Groups:   x, y [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 DEF    d     e         2
5      5 DEF    d     e         2
6      6 DEF    d     e         2
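
If the orientation of each group's first row should be kept while still keying on x and y (rather than on string), one possible sketch is below. It is not the code from either version of this answer: the helper columns lo, hi and flipped are hypothetical names, and the unordered key is built with pmin()/pmax() on the character columns.

```r
library(dplyr)
library(stringi)

dat <- tibble(sample = 1:6,
              string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
              x = c("a", "a", "b", "e", "d", "d"),
              y = c("b", "b", "a", "d", "e", "e"))

res <- dat %>%
  # Unordered key: the alphabetically smaller and larger of x and y
  mutate(lo = pmin(x, y), hi = pmax(x, y)) %>%
  group_by(lo, hi) %>%
  mutate(group = cur_group_id(),
         # A row is flipped when its x differs from the group's first x
         flipped = x != first(x),
         string = if_else(flipped, stri_reverse(string), string),
         # y is updated before x so that it still sees the original x
         y = if_else(flipped, x, y),
         x = first(x)) %>%
  ungroup() %>%
  select(-lo, -hi, -flipped)
```

Because the key does not depend on row order, this variant should also work when rows of the same group are not adjacent. Group numbers follow the sorted (lo, hi) keys, not the order of first appearance.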
Maël
  • This is very helpful and answers most of my question, but I'd like to preserve the order of the first row in each group. Note row 4 in my input and rows 4-6 in my desired output. – acvill Apr 07 '22 at 15:45
  • @acvill see my edit. – Maël Apr 07 '22 at 16:30