I want to filter a pandas DataFrame with 3 columns in Python. I want only the rows whose values in the first two columns are the same as in another row, while the third column differs. For example:

   A  B  C
   1  4  2
   1  5  3
   2  3  3
   3  1  1
   4  3  2
   2  3  5

In the example above, I would like to get only rows 3 and 6 (counting data rows from 1), since their first two columns match.

1 Answer

Use duplicated and boolean indexing:

out = df[df[['A', 'B']].duplicated(keep=False)]

output:

   A  B  C
2  2  3  3
5  2  3  5
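
The line above only checks that the (A, B) pair repeats; it does not look at C. If rows that repeat (A, B) with an identical C should be left out, one possible sketch is to add a per-group check on C. The sample frame is rebuilt from the question, and the names `dup_ab`, `c_varies`, and `out_strict` are illustrative assumptions, not part of the original code:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "A": [1, 1, 2, 3, 4, 2],
    "B": [4, 5, 3, 1, 3, 3],
    "C": [2, 3, 3, 1, 2, 5],
})

# True for every row whose (A, B) pair occurs more than once
dup_ab = df.duplicated(subset=["A", "B"], keep=False)

# True for every row whose (A, B) group holds more than one distinct C
# (hypothetical extra condition, assuming "but not the third one" must be enforced)
c_varies = df.groupby(["A", "B"])["C"].transform("nunique") > 1

out_strict = df[dup_ab & c_varies]
print(out_strict)
```

On the sample data this selects the same two rows (index 2 and 5), because their C values differ. Note that `c_varies` already implies `dup_ab`; both masks are kept here only to make each condition explicit.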
mozway