I have a dataframe with 15 columns that is used to calculate a score. Two columns (A and B) are my independent variables, and both contain duplicate values. Column C holds the calculated score, and I have already sorted the dataframe by column C in descending order. The goal is to keep the highest-scoring combination of the A and B values and drop any later rows that reuse those values.
Column A | Column B | Column C |
---|---|---|
5 | 10 | 1.5 |
5 | 12 | 1.4 |
10 | 12 | 1.0 |
7 | 14 | 0.9 |
7 | 9 | 0.8 |
12 | 6 | 0.7 |
14 | 4 | 0.6 |
In the above example, I would want the second, third, fifth, sixth, and seventh rows dropped. The sixth and seventh rows would be dropped because 12 and 14 already appeared in column B in rows above.
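
To make the intended rule concrete, here is a minimal pandas sketch. It rests on an assumption read off the example rather than something stated outright: a row is kept only if neither of its values has appeared in any earlier row, in either column, whether or not that earlier row was itself kept. The column names `A`, `B`, `C` and the helper `keep_first_unseen` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Example data from the table above (column names are assumed).
df = pd.DataFrame({
    "A": [5, 5, 10, 7, 7, 12, 14],
    "B": [10, 12, 12, 14, 9, 6, 4],
    "C": [1.5, 1.4, 1.0, 0.9, 0.8, 0.7, 0.6],
})


def keep_first_unseen(frame, cols=("A", "B")):
    """Keep rows whose values in `cols` have not appeared in any earlier row.

    Assumption: values are pooled across the columns, and a value counts
    as "seen" even if the row it appeared in was dropped.
    """
    seen = set()
    mask = []
    for values in frame[list(cols)].itertuples(index=False, name=None):
        # Keep the row only if none of its values were seen before.
        mask.append(seen.isdisjoint(values))
        # Record the values regardless of whether the row was kept.
        seen.update(values)
    return frame.loc[mask]


# Sort by score first so the highest-scoring rows are visited first.
result = keep_first_unseen(df.sort_values("C", ascending=False))
print(result)
```

On the example data this leaves the first and fourth rows, (5, 10, 1.5) and (7, 14, 0.9), which matches the rows I expect to survive. If values should only count as used once their row is kept, the `seen.update(values)` line would need to move inside a check on the mask value, and the sixth row would then survive.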