I want to collapse the rows of a data frame into ortholog groups, so that orthologs sharing any gene end up in one row together with all of their corresponding genes.
For example:
Column A | Column B |
---|---|
Ortho1 | gene1 |
Ortho2 | gene2, gene3 |
Ortho3 | gene4, gene5, gene6 |
Ortho4 | gene5, gene6 |
Ortho5 | gene6, gene7 |
Ortho6 | gene1, gene8 |
should become:
Column A | Column B |
---|---|
Ortho1, Ortho6 | gene1, gene8 |
Ortho2 | gene2, gene3 |
Ortho3, Ortho4, Ortho5 | gene4, gene5, gene6, gene7 |
I have tried to `merge` them, but that requires an ID column, which my data does not have. I also tried a `for` loop that uses `intersect()` to find overlapping genes. It feels like there should be a simpler way to overcome this bottleneck.
- The original data was in long format, with one gene per row:
Column A | Column B |
---|---|
Ortho1 | gene1 |
Ortho2 | gene2 |
Ortho2 | gene3 |
...
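
To make the intended grouping concrete, here is a minimal sketch of how I understand the problem: each ortholog and each gene is a node, each row is an edge, and the groups I want are the connected components of that graph. This is only an illustration in plain Python, not a definitive solution. The `rows` list and the `collapse_groups` name are hypothetical, and the long-format pairs are filled in from the first example table.

```python
from collections import defaultdict

# Hypothetical long-format input mirroring the example table:
# one (ortholog, gene) pair per row.
rows = [
    ("Ortho1", "gene1"),
    ("Ortho2", "gene2"), ("Ortho2", "gene3"),
    ("Ortho3", "gene4"), ("Ortho3", "gene5"), ("Ortho3", "gene6"),
    ("Ortho4", "gene5"), ("Ortho4", "gene6"),
    ("Ortho5", "gene6"), ("Ortho5", "gene7"),
    ("Ortho6", "gene1"), ("Ortho6", "gene8"),
]


def collapse_groups(pairs):
    """Group orthologs that are linked, directly or transitively, by shared genes."""
    parent = {}

    def find(node):
        # Return the representative of the node's set, shortening paths on the way.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag nodes by kind so an ortholog and a gene can never collide by name.
    for ortho, gene in pairs:
        union(("ortho", ortho), ("gene", gene))

    orthos = defaultdict(set)
    genes = defaultdict(set)
    for ortho, gene in pairs:
        root = find(("ortho", ortho))
        orthos[root].add(ortho)
        genes[root].add(gene)

    # One (orthologs, genes) pair per connected component, in sorted order.
    return sorted((sorted(orthos[r]), sorted(genes[r])) for r in orthos)


groups = collapse_groups(rows)
for ortho_group, gene_group in groups:
    print(", ".join(ortho_group), "|", ", ".join(gene_group))
```

On the example data this yields the three groups shown in the expected table above. The same idea should carry over to any graph library that offers a connected-components function.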