I'm trying to get all of the unique items from a data frame column that contains comma-separated strings.

I have this data frame:

df = data.frame(v1 = c("A,S", "A,B,F", "A,B,C,D"))

And I want the outcome to be this:

A,B,C,D,F,S

A loop would work but I know there's an easier way.

Charles Stangor

1 Answer

We can split the column on the comma (","), unlist the resulting list into a single vector, and take its sorted unique elements:

sort(unique(unlist(strsplit(df$v1, ","))))
[1] "A" "B" "C" "D" "F" "S"
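The question asks for a single comma-separated string rather than a character vector, so the result above may need one more step. A minimal sketch, assuming the goal is exactly the string shown in the question: `items` and `result` are names introduced here for illustration, and the `as.character()` call is an added safeguard in case `v1` is a factor (the default for `data.frame()` before R 4.0.0), which `strsplit()` would reject.

```r
# Example data from the question
df <- data.frame(v1 = c("A,S", "A,B,F", "A,B,C,D"))

# Split each string on a literal comma, flatten, de-duplicate, and sort.
# as.character() is a guard for the case where v1 is a factor.
items <- sort(unique(unlist(strsplit(as.character(df$v1), ",", fixed = TRUE))))

# Collapse the sorted unique items back into one comma-separated string
result <- paste(items, collapse = ",")
result
```

If the values might contain spaces after the commas, wrapping the unlisted vector in `trimws()` before `unique()` is one way to handle that.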

Or, using the tidyverse: split column 'v1' into separate rows at the delimiter, keep the distinct rows, and sort them:

library(dplyr)
library(tidyr)
df %>% 
   separate_rows(v1) %>% 
   distinct(v1) %>% 
   arrange(v1)

Output:

# A tibble: 6 x 1
  v1   
  <chr>
1 A    
2 B    
3 C    
4 D    
5 F    
6 S
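The pipeline above returns a one-column tibble. If a single string like the one in the question is wanted instead, the column could be pulled out and collapsed. This is a sketch rather than part of the original answer: `result_tidy` is a hypothetical name, and `sep = ","` is passed explicitly here, although the default separator also works for this data.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(v1 = c("A,S", "A,B,F", "A,B,C,D"))

result_tidy <- df %>%
  separate_rows(v1, sep = ",") %>%  # one row per comma-separated item
  distinct(v1) %>%                  # drop duplicate items
  arrange(v1) %>%                   # sort the items
  pull(v1) %>%                      # extract the column as a character vector
  paste(collapse = ",")             # join into a single string

result_tidy
```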
akrun