I have a pandas DataFrame of different permutations of values (toy version below; my actual DataFrame has more columns and rows).
My goal is to remove rows so that no value appears more than once within any column. Critically, every column has to be checked for each row, not just one.
import itertools
import pandas as pd
# Every ordering of 1, 2, 3 becomes one row of the toy DataFrame
check = list(itertools.permutations([1, 2, 3]))
test = pd.DataFrame(check, columns=['A', 'B', 'C'])
index A B C
0 1 2 3
1 1 3 2
2 2 1 3
3 2 3 1
4 3 1 2
5 3 2 1
Desired output:
index A B C
0 1 2 3
3 2 3 1
4 3 1 2
For example, I want to drop row 1 because both it and row 0 contain a 1 in the A column. I also want to drop row 2 because it and row 0 contain a 3 in the C column. And I want to drop row 5 because it and row 4 contain a 3 in the A column and because it and row 0 contain a 2 in the B column.
In other words, I am trying to generate a dataframe that contains unique combinations, not permutations: each kept row must not share a value with any other kept row in the same column.
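To make the rule concrete, here is a minimal sketch of how I read it, assuming a greedy top-to-bottom pass: a row is kept only if none of its values has already been seen in the same column among the rows kept so far. The function name `drop_column_repeats` is just a placeholder of mine, and the result depends on row order, so this may not be the only valid reading.

```python
import itertools
import pandas as pd

def drop_column_repeats(df):
    """Greedy filter: keep a row only if, for every column, its value
    has not already appeared in that column among the kept rows."""
    seen = [set() for _ in df.columns]  # one set of seen values per column
    keep = []
    for row in df.itertuples(index=False, name=None):
        clash = any(value in column_seen for value, column_seen in zip(row, seen))
        keep.append(not clash)
        if not clash:
            # Record this row's values only when the row is kept
            for value, column_seen in zip(row, seen):
                column_seen.add(value)
    return df.loc[keep]

check = list(itertools.permutations([1, 2, 3]))
test = pd.DataFrame(check, columns=['A', 'B', 'C'])
result = drop_column_repeats(test)
print(result)
```

On the toy data this keeps rows 0, 3 and 4, which matches the desired output above. It loops over rows in Python, so it may be slow on a large DataFrame.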