drop_duplicates() stopped working in Python pandas

Question

This code previously worked in Python 3 to remove duplicate values across an entire DataFrame while keeping the first occurrence. After coming back to my script, it no longer removes duplicates from a pandas DataFrame.

df = df.apply(lambda x: x.drop_duplicates(), axis=1)

so if I have

I want to get as an output

I don't mind if the blanks return as 'nan'

I also tried the following

df.drop_duplicates(subset=None, keep='first')

and

df.drop_duplicates(subset=None, keep='first', inplace=True)

Any advice / alternatives would be welcome!

First occurrence traversing row-wise or column-wise? – IanS Nov 27 '18 at 15:10

BENY · Accepted Answer · 2018-11-27T14:54:10.510

3

After you attached the data, I think you can use duplicated:

newdf=df[~df.stack().duplicated().unstack()]
newdf
Out[131]: 
      a    b     c
0   0.0  1.0   2.0
1   3.0  4.0   NaN
2   NaN  8.0   9.0
3  10.0  NaN  11.0
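
For anyone who wants to reproduce this, here is a minimal sketch. The input frame is an assumption: the question's data is not shown in the text, so the values below were chosen only because they give the output above. The idea is that stack() flattens the frame row by row into one Series, duplicated() flags every repeat after the first occurrence, and unstack() turns those flags back into a mask with the original shape.

```python
import pandas as pd

# Hypothetical input (not the asker's actual data): 0, 4 and 9 each
# appear twice when the frame is read row by row.
df = pd.DataFrame({
    "a": [0, 3, 4, 10],
    "b": [1, 4, 8, 9],
    "c": [2, 0, 9, 11],
})

# Flatten row-wise, flag repeats of earlier values, restore the shape.
mask = df.stack().duplicated().unstack()

# Indexing with a boolean frame keeps cells where it is True and
# replaces the rest with NaN, so the columns become float.
newdf = df[~mask]
print(newdf)
```

Because stack() walks the frame row by row, "first occurrence" here means first in row-wise order. The NaN values also explain why the integers come back as floats.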

edited Nov 27 '18 at 14:54

answered Nov 27 '18 at 14:47

this worked great with the data converted to int64 - thank you! – C_psy Nov 27 '18 at 15:22

score 0 · Answer 2 · answered Nov 27 '18 at 14:26

You need inplace to be True:

df.drop_duplicates(subset=None, keep='first', inplace=True)
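
Note that drop_duplicates() compares whole rows, so even with inplace=True it will not blank out a value that merely repeats in another cell. A small sketch with made-up data (the frame below is an assumption, not the asker's data):

```python
import pandas as pd

# Hypothetical frame: rows 0 and 2 are identical, and the value 0
# also shows up again in row 1, column "b".
df = pd.DataFrame({"a": [0, 3, 0], "b": [1, 0, 1]})

# Without inplace=True a new frame is returned and df is unchanged.
deduped = df.drop_duplicates(keep="first")

# With inplace=True, df itself is modified and the call returns None.
result = df.drop_duplicates(keep="first", inplace=True)

print(deduped)  # the duplicate row is gone, the repeated 0 in "b" stays
print(result)
```

So a bare df.drop_duplicates(...) line with no assignment discards its result, and in either form only fully identical rows are dropped.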

Toby Petty

aah yes, I had tried True to update that frame too - edited Question – C_psy Nov 27 '18 at 14:27
