I'm working on a dataset in which one column contains multiple features separated by commas. The number of features varies from one observation to the next.
df <- data.frame(x=c("a", "a,b,c", "a,c", "b,c", "", "b"))
      x
1     a
2 a,b,c
3   a,c
4   b,c
5
6     b
I want to split this into multiple logical columns like this:
  a b c
1 1 0 0
2 1 1 1
3 1 0 1
4 0 1 1
5 0 0 0
6 0 1 0
where each column indicates whether the observation contained that string in the original column. How can this be achieved? Is there a way to do it without specifying the output columns in advance? For example, what if an observation contains:
"a,b,d"
How can I do it in a way that captures all unique features of the original column?
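For reference, here is a minimal base R sketch of the kind of result I'm after. It is only one possible approach, not necessarily the best one; the names parts, features, out and result are placeholders I made up for illustration. The column names are derived from the data, so a value such as "a,b,d" would simply add a d column.

```r
df <- data.frame(x = c("a", "a,b,c", "a,c", "b,c", "", "b"))

# Split every entry on commas; as.character() guards against factor columns.
# An empty string yields character(0), so that row ends up all zeros.
parts <- strsplit(as.character(df$x), ",", fixed = TRUE)

# Collect every distinct feature that appears anywhere in the column
features <- sort(unique(unlist(parts)))

# One row per observation, one 0/1 indicator per feature
out <- do.call(rbind, lapply(parts, function(p) as.integer(features %in% p)))
colnames(out) <- features

result <- as.data.frame(out)
print(result)
```

If true logical columns are preferred over 0/1, dropping the as.integer() call should leave TRUE/FALSE values instead.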