I have two DataFrames that I combine, and the result definitely contains duplicates, as the CSV round trip further down shows:

total_scrobbles = total_scrobbles.append(new_scrobbles)

After that, the drop_duplicates function doesn't do anything. Not a single row is deleted.

total_scrobbles.drop_duplicates(inplace = True)

But if I save the new DataFrame as a CSV, load it again and call the same drop function, it works:

total_scrobbles.to_csv('test.csv', index=False)
total_scrobbles = pd.read_csv('test.csv')
total_scrobbles.drop_duplicates(inplace = True)

Now all duplicates are deleted.

I have found a workaround, but can anybody tell me why this happens? It doesn't make any sense to me. Is there a better solution than a pointless round trip through to_csv and read_csv?

Thanks a lot.

thepic
  • Have you checked if both DataFrames are exactly the same? – boechat107 Mar 26 '20 at 17:32
  • My guess is that it has something to do with the data format you're using, but it is impossible to tell without the data. Provide ~10 problematic rows for which the problem reproduces! – LudvigH Mar 26 '20 at 17:39
  • Yes, some sample data would be good. You could also experiment with the `subset` argument to the `drop_duplicates()` method, to see which columns are causing the problem. – Arne Mar 26 '20 at 20:56
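
The checks suggested in the comments can be sketched as follows. This is only a guess at the cause, since the question contains no sample data: it assumes that one column holds the same values with different types in the two frames (here an int versus a str timestamp), which read_csv would silently unify by re-inferring the dtypes. The column names and values are made up for illustration, and pd.concat stands in for DataFrame.append, which was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical sample data (not from the question): the same scrobble twice,
# but "timestamp" is an int in one frame and a str in the other.
total_scrobbles = pd.DataFrame({"artist": ["Muse"], "timestamp": [1585243920]})
new_scrobbles = pd.DataFrame({"artist": ["Muse"], "timestamp": ["1585243920"]})

# pd.concat replaces DataFrame.append, which no longer exists in pandas 2.0+.
combined = pd.concat([total_scrobbles, new_scrobbles], ignore_index=True)

# Check 1: list the columns whose dtypes differ between the two source frames.
dtype_mismatch = [
    col
    for col in total_scrobbles.columns
    if str(total_scrobbles[col].dtype) != str(new_scrobbles[col].dtype)
]


def duplicates_by_column(df):
    """Check 2: count duplicated values per column using `subset`.

    A column that reports 0 while you expect duplicates is a likely culprit.
    """
    return {col: int(df.duplicated(subset=[col]).sum()) for col in df.columns}


# With mixed types, 1585243920 and "1585243920" are different values,
# so drop_duplicates keeps both rows.
rows_before = len(combined.drop_duplicates())

# Possible fix without the CSV round trip: cast the combined frame back to
# the dtypes of the original frame, then drop duplicates.
aligned = combined.astype(total_scrobbles.dtypes.to_dict())
rows_after = len(aligned.drop_duplicates())

print(dtype_mismatch)                  # columns with differing dtypes
print(duplicates_by_column(combined))  # duplicates found per column
print(rows_before, rows_after)         # row counts before and after aligning
```

If the mismatch turns out to be something else (stray whitespace, float rounding, datetime versus string), the same two checks should still point at the offending column, but the astype step would need to be replaced by a matching cleanup.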

0 Answers