Find all unique values in column separated by comma

Question

I have multiple observations of one species with different observers / groups of observers and want to create a list of all unique observers. My data look like this:

data <- read.table(text="species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

My output should return a list of all unique observers - so:

A,B,C,D,E,F

I tried splitting the observer column with the following command, but it only returns the unique combinations of observers.

all_observers <- unique(strsplit(as.character(data$observer), ","))

all_observers
[[1]]
[1] "A" "B"

[[2]]
[1] "B" "E"

[[3]]
[1] "D" "E" "A" "C" "C"

[[4]]
[1] "F"

Comment (Gregor Thomas, Jan 07 '19 at 16:43): You need to `unlist` before you do `unique`. Try `unique(unlist(strsplit(...)))`.

score 5 · Accepted Answer · answered Jan 07 '19 at 17:09

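You're almost there. `strsplit` returns a list with one character vector per row, so `unique` only drops duplicated vectors (that is, duplicated combinations). Call `unlist` first to flatten the list into a single character vector, then apply `unique`: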

all_observers <- unique(unlist(strsplit(as.character(data$observer), ",")))
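
If the result should match the single comma-separated string shown in the question, the vector can be sorted and collapsed afterwards. This is a minimal sketch using the question's sample data; the name `observers_csv` is only illustrative, and the `as.character` call is dropped on the assumption that the column is already character (it was read with `stringsAsFactors = FALSE`).

```r
# Sample data from the question
data <- read.table(text = "species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

# Split each row on commas, flatten the list, and drop duplicates
all_observers <- unique(unlist(strsplit(data$observer, ",")))

# Sort alphabetically and collapse into one comma-separated string
observers_csv <- paste(sort(all_observers), collapse = ",")
observers_csv
# [1] "A,B,C,D,E,F"
```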

Gregor Thomas


score 2 · Answer 2 · answered Jan 07 '19 at 16:44

We can use `separate_rows` to split the 'observer' column into one row per observer, keep the distinct rows, then group by 'species' and collapse the observers into a single string with `toString`:

library(tidyverse)
data %>% 
   separate_rows(observer) %>% 
   distinct %>% 
   group_by(species) %>% 
   summarise(observer = toString(observer))
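
If a plain vector of observers across all species is wanted instead of one string per species, a variant along these lines might work. It is a sketch that assumes the dplyr and tidyr packages are installed; passing `sep = ","` explicitly and sorting with `arrange` are choices made here, not part of the answer above.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
data <- read.table(text = "species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

all_observers <- data %>%
  separate_rows(observer, sep = ",") %>%  # one row per observer
  distinct(observer) %>%                  # drop repeated observers
  arrange(observer) %>%                   # sort alphabetically
  pull(observer)                          # extract as a character vector

all_observers
# [1] "A" "B" "C" "D" "E" "F"
```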

score 2 · Answer 3 · answered Jan 07 '19 at 17:13

You could also use scan()

unique(scan(text=data$observer, what="", sep=","))
# Read 14 items
# [1] "A" "B" "E" "D" "C" "F"
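
Real data often has stray spaces around the commas. In that case, `scan()`'s `strip.white` argument may help, and `quiet = TRUE` suppresses the "Read ... items" message. The `messy_observer` vector below is hypothetical and only illustrates the idea.

```r
# Hypothetical input with inconsistent spacing around the commas
messy_observer <- c("A, B", "B ,E", "F")

# strip.white = TRUE trims leading/trailing whitespace from each field;
# quiet = TRUE suppresses the "Read n items" message
clean_observers <- unique(
  scan(text = messy_observer, what = "", sep = ",",
       strip.white = TRUE, quiet = TRUE)
)
clean_observers
# [1] "A" "B" "E" "F"
```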

Rich Scriven

