I have a data frame like below.
dat <- data.frame(v1=c("a","b","c","c","a","w","f"),
v2=c("z","a","a","w","p","e","h"))
v1 v2
1 a z
2 b a
3 c a
4 c w
5 a p
6 w e
7 f h
I want to add a group column based on whether these letters appear in the same row.
v1 v2 gp
1 a z 1
2 b a 1
3 c a 1
4 c w 1
5 a p 1
6 w e 1
7 f h 2
My idea is to first assign the first row to group 1, and then any row that v1 or v2 is "a" or "z" will also be assigned to group 1.
There are scenarios like row 3 and 4, where c is assigned to group 1, because, in row 3, v2 is "a". And "w" is assigned to group 1 because in row 4 v1 is "c", which is assigned to group 1 previously. But my list is very long, so I cannot keep checking all the "descendants".
I wonder if there is a way to group these letters, and return a list with group number. Something like the below table will do.
letter gp
a 1
b 1
c 1
e 1
f 2
h 2
w 1
z 1