I would like to do something similar to dropping duplicates from a DataFrame, except that the order of the columns should not matter. That is, the function should treat a row consisting of the entries 'a', 'b' as identical to a row consisting of the entries 'b', 'a'. For example, given
import pandas as pd
df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['a', 'b'], ['b', 'a']])
   0  1
0  a  b
1  c  d
2  a  b
3  b  a
I would like to obtain:
   0  1
0  a  b
1  c  d
Efficiency is the priority, as I run this on a huge dataset within a groupby operation.
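To make the desired behaviour concrete, here is a minimal sketch of one possible approach, not a definitive or benchmarked solution. It assumes every column holds mutually comparable values (for example, all strings) with no missing entries. It sorts the values within each row using np.sort, then uses duplicated on the sorted copy to build a mask for the original frame. The helper name drop_unordered_duplicates is hypothetical.

```python
import numpy as np
import pandas as pd


def drop_unordered_duplicates(df):
    """Hypothetical helper: drop rows that repeat an earlier row up to column order."""
    # Sort the values within each row so that ('b', 'a') becomes ('a', 'b').
    sorted_rows = np.sort(df.to_numpy(), axis=1)
    # Flag rows whose sorted values have already appeared in an earlier row.
    is_dup = pd.DataFrame(sorted_rows, index=df.index).duplicated()
    # Keep the first occurrences, with their original (unsorted) values.
    return df[~is_dup.to_numpy()]


df = pd.DataFrame([['a', 'b'], ['c', 'd'], ['a', 'b'], ['b', 'a']])
result = drop_unordered_duplicates(df)
print(result)
```

Inside a groupby, it may be cheaper to sort the rows once for the whole frame and include the group key among the columns checked by duplicated, rather than calling the helper once per group, though that is an unmeasured assumption.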