
Say I have a list a which is defined as:

a <- list("aaa;bbb", "aaa", "bbb", "aaa;ccc")

I want to split this list by semicolon ;, get only unique values, and return another list. So far I have split the list using str_split():

a <- str_split(a, ";")

which gives me

> a
[[1]]
[1] "aaa" "bbb"

[[2]]
[1] "aaa"

[[3]]
[1] "bbb"

[[4]]
[1] "aaa" "ccc"

How can I manipulate this list (using unique()?) to give me something like

[[1]]
[1] "aaa" 

[[2]]
[1] "bbb"

[[3]]
[1] "ccc"

or more simply,

[[1]]
[1] "aaa" "bbb" "ccc"
Sotos
Kyle Weise

2 Answers


One option is to flatten the split result with unlist(), keep only the distinct values with unique(), and wrap the result in list().

    # Start with the list from the question
    a <- list("aaa;bbb", "aaa", "bbb", "aaa;ccc")
    # Load stringr, which provides str_split()
    library(stringr)
    a <- str_split(a, ";")
    # Flatten with unlist(), deduplicate with unique(), then wrap in list()
    list(unique(unlist(a)))
    # Output:
    # [[1]]
    # [1] "aaa" "bbb" "ccc"
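
The question asks for two possible output shapes: one list element per unique value, or a single element holding all of them. As a minimal sketch in base R only (the names `pieces` and `vals` are illustrative, not from the original post), both shapes can be built from the same deduplicated vector:

```r
# Base R only; no packages required.
a <- list("aaa;bbb", "aaa", "bbb", "aaa;ccc")

# Split every string on a literal ";" and flatten to one character vector.
pieces <- unlist(strsplit(unlist(a), ";", fixed = TRUE))

# Keep the first occurrence of each value, in order of appearance.
vals <- unique(pieces)

# Shape 1: one list element per unique value.
per_value <- as.list(vals)

# Shape 2: a single list element holding all unique values.
single <- list(vals)
```

Here `as.list()` turns each value into its own list element, while `list()` wraps the whole vector as one element, so the choice between them only depends on which shape the downstream code expects.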
Miha
  • You could use `strsplit(unlist(a), ";")`, which works with `strsplit` from base R (no real need for `library(stringr)`). – Ben Bolker Jul 24 '17 at 13:15
  • @Miha we have an "Rkoholiki" slack going on. Drop me a line at romunov located at gmail if you would like to join us. This message will self destruct in XY minutes. :) – Roman Luštrik Jul 24 '17 at 14:18

One alternative in base R is rapply(), which applies a function to each of the innermost elements of a nested list and, by default, returns the most simplified object possible. Here, it splits each string in a and returns a character vector.

unique(rapply(a, strsplit, split=";"))
[1] "aaa" "bbb" "ccc"

To return a list, wrap the output in list():

list(unique(rapply(a, strsplit, split=";")))
[[1]]
[1] "aaa" "bbb" "ccc"
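
The "most simplified object" behaviour described above comes from rapply()'s `how` argument, which defaults to "unlist". The sketch below is only an illustration (the names `flat` and `nested` and the anonymous helper function are assumptions, not part of the answer); it contrasts the default with `how = "list"`, which keeps one list element per input string:

```r
a <- list("aaa;bbb", "aaa", "bbb", "aaa;ccc")

# Default how = "unlist": all results are flattened into one character vector.
flat <- rapply(a, strsplit, split = ";")

# how = "list": the list structure of `a` is preserved, so each element
# holds the pieces of the corresponding input string.
nested <- rapply(a, function(x) strsplit(x, ";", fixed = TRUE)[[1]], how = "list")

# Deduplicating the flattened vector gives the values asked for in the question.
unique(flat)
```

With the default, unique() can be applied directly to the result; with `how = "list"`, the result would still need unlist() before deduplicating.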
lmo