I am new to pandas and have been trying to solve the following problem.

I want to drop any row that has a duplicate value in A but a non-duplicate value in B.

Here is the kind of output I want:

[image: expected output]

Krishan

2 Answers

If I understand correctly, this is what you need:

# True where exactly one of A, B differs from the previous row
a = df['A'].ne(df['A'].shift()).ne(df['B'].ne(df['B'].shift()))
df[~a].reset_index(drop=True)

Output

    A   B
0   2   z
1   3   x
2   3   x
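
For anyone who wants to run this, here is a minimal sketch. The input frame is an assumption: the original data was posted only as an image, so the values below are inferred from the outputs quoted in this thread. The names `a_changed`, `b_changed`, `mask` and `result` are illustrative only.

```python
import pandas as pd

# Assumed input, inferred from the outputs shown in this thread.
df = pd.DataFrame({"A": [2, 2, 3, 3], "B": ["z", "y", "x", "x"]})

# Compare each value with the one in the previous row.
a_changed = df["A"].ne(df["A"].shift())
b_changed = df["B"].ne(df["B"].shift())

# Flag rows where exactly one of the two columns changed.
mask = a_changed.ne(b_changed)

# Drop the flagged rows and renumber the index.
result = df[~mask].reset_index(drop=True)
print(result)
```

On this assumed input the row (2, y) is flagged and dropped, so the first of the two A = 2 rows is the one that survives.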
moys
  • This is not the expected output. – ansev Sep 18 '19 at 12:43
  • @Krishan, based on your description, my code keeps the first row, but your picture seems to keep the second row. Can you clarify which is correct? – moys Sep 18 '19 at 12:51
I think you need:

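# True where each column value matches the adjacent row above or below
cond = (df.eq(df.shift(-1)) | df.eq(df.shift())).all(axis=1)
# Keep the last non-repeated row per A, then append the repeated rows
pd.concat([df[~cond].groupby('A').last().reset_index(), df[cond]])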

    A   B
0   2   y
2   3   x
3   3   x
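
A self-contained sketch of the same idea follows. The input frame is an assumption, since the question's data was posted only as an image; the values are inferred from the outputs quoted in this thread. The names `unique_part`, `repeated_part` and `result` are illustrative only.

```python
import pandas as pd

# Assumed input, inferred from the outputs shown in this thread.
df = pd.DataFrame({"A": [2, 2, 3, 3], "B": ["z", "y", "x", "x"]})

# True where every column matches the adjacent row above or below.
cond = (df.eq(df.shift(-1)) | df.eq(df.shift())).all(axis=1)

# Among the remaining rows, keep only the last one for each value of A.
unique_part = df[~cond].groupby("A").last().reset_index()

# Rows that are repeated in full are kept unchanged.
repeated_part = df[cond]

result = pd.concat([unique_part, repeated_part])
print(result)
```

On this assumed input the result keeps (2, y) and both (3, x) rows, with index labels 0, 2, 3. Appending `.reset_index(drop=True)` would renumber them if a clean index is wanted.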
ansev