I have the following Series, which is the result of calling stack() on a DataFrame:
col1  col2
A     GS      0.522696
F     GS      0.422812
GS    A       0.522696
      F       0.422812
In the above example, the rows (A, GS) = 0.522696 and (GS, A) = 0.522696 are considered to be the same, so I need to filter out one of them. The same goes for (F, GS) = 0.422812 and (GS, F) = 0.422812.
Essentially, every row is duplicated in the sense that col1 and col2 are reversed while the corresponding float value stays the same (i.e., (GS, F) is a duplicate of (F, GS)). I therefore need to filter out the 'duplicate'. It doesn't matter which one of each pair gets filtered out; I just need the result for the above example to contain only two rows.
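To make the goal concrete, here is a minimal sketch of the kind of filtering I'm after. It assumes the Series has a two-level MultiIndex named col1/col2 as shown above; the variable names (s, normalized, deduped) are just placeholders for illustration, and this is one possible approach rather than a definitive one. The idea is to sort the labels inside each index pair so that (GS, A) and (A, GS) compare equal, then drop the repeats.

```python
import pandas as pd

# Rebuild the example Series from the question (values copied from above).
index = pd.MultiIndex.from_tuples(
    [("A", "GS"), ("F", "GS"), ("GS", "A"), ("GS", "F")],
    names=["col1", "col2"],
)
s = pd.Series([0.522696, 0.422812, 0.522696, 0.422812], index=index)

# Sort the labels within each pair so reversed pairs become identical.
normalized = pd.MultiIndex.from_tuples([tuple(sorted(pair)) for pair in s.index])

# duplicated() flags every repeat after the first occurrence;
# inverting the mask keeps one row per unordered pair.
deduped = s[~normalized.duplicated()]
print(deduped)
```

With the example data this keeps the first row seen for each unordered pair, which is fine here since it doesn't matter which of the two survives.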
I've tried converting the structure into a dict to see if it would be easier to work with, i.e. Series.to_dict(), which results in:
{('GS', 'F'): 0.422812, ('A', 'GS'): 0.522696,
('F', 'GS'): 0.422812, ('GS', 'A'): 0.522696}
But I still haven't had any luck, regardless of whether the data is in a Series or a dict.
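For the dict form, a rough sketch of the desired behaviour could look like the following. It assumes the keys are always 2-tuples of hashable labels; the names d, seen and deduped are hypothetical. Each key is turned into a frozenset so that the order of the two labels no longer matters, and only the first key seen for each unordered pair is kept.

```python
# The dict produced by Series.to_dict() in the question.
d = {("GS", "F"): 0.422812, ("A", "GS"): 0.522696,
     ("F", "GS"): 0.422812, ("GS", "A"): 0.522696}

seen = set()    # unordered pairs already encountered
deduped = {}    # one entry per unordered pair

for pair, value in d.items():
    key = frozenset(pair)  # frozenset ignores order: {GS, F} == {F, GS}
    if key not in seen:
        seen.add(key)
        deduped[pair] = value

print(deduped)
```

Which of the two reversed keys survives depends on the iteration order of the dict, which is acceptable for this problem.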