
Here is the data set I'm working on:

[Image: preview of the data set]

Basically, I want to delete duplicate rows. I know about the drop_duplicates method, but I need some help using it.

Let me sort the data first so that the problem is clear.

by_streamed = data.sort_values(by='Streams', ascending=False)
by_streamed

[Image: the data sorted by Streams in descending order]

So when I take the top 10 streamed songs, the duplicates obviously interfere. If you look closely, though, the ranks of these duplicated songs are different.

I want to remove this type of duplicate row. Here's my code:

data = data.drop_duplicates(subset=['Artist', 'Title'], keep='first')

[Image: the data after drop_duplicates]

But this removes a lot of rows that weren't supposed to be removed.

There is clearly an issue with how I'm using subset, but I can't work out what it is. It would be great if you could help me figure it out. Thanks in advance.
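To make the symptom concrete, here is a minimal sketch of one possible cause, not a confirmed diagnosis. It assumes the frame is not sorted by Streams when drop_duplicates runs, so keep='first' keeps whichever row of each (Artist, Title) pair appears first rather than the most-streamed one. The sample rows and the Rank column values are made up for illustration; only the column names Artist, Title and Streams come from the code above.

```python
import pandas as pd

# Hypothetical sample rows; the real data set is only shown as a screenshot.
data = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5],
    "Artist": ["A", "B", "A", "C", "B"],
    "Title": ["Song X", "Song Y", "Song X", "Song Z", "Song Y"],
    "Streams": [100, 900, 500, 300, 200],
})

# Without sorting, keep='first' keeps the first row of each pair in frame order,
# which may be a low-stream entry.
unsorted = data.drop_duplicates(subset=["Artist", "Title"], keep="first")

# Sorting by Streams first puts the highest-stream row of each pair on top,
# so keep='first' retains that row and drops the lower-stream repeats.
deduped = (
    data.sort_values(by="Streams", ascending=False)
        .drop_duplicates(subset=["Artist", "Title"], keep="first")
        .reset_index(drop=True)
)

print(unsorted["Streams"].tolist())
print(deduped["Streams"].tolist())
```

If the real Streams column is stored as text rather than numbers, the sort order would differ as well, so checking data.dtypes first may be worthwhile.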

kirti purohit
  • `But this removes a lot of columns that weren't supposed to be.` Can you explain more with a small data sample of 5 rows? – jezrael Mar 19 '21 at 08:42
  • Do you mean `rows` and not `columns`? – Umar.H Mar 19 '21 at 08:49
  • So, you want to remove the duplicates based on Artist and Title, but not on any other columns? "But this removes a lot of columns that weren't supposed to be." This sentence in the question is misleading, because you are dropping rows, not columns. – ThePyGuy Mar 19 '21 at 08:57
  • I meant rows. I have changed it in the question. – kirti purohit Mar 19 '21 at 09:05
  • Would you be able to help me please? @ThePyGuy – kirti purohit Mar 19 '21 at 09:47
  • Could you give some example rows that weren't supposed to be removed? – Ynjxsjmh Mar 19 '21 at 09:51
  • It's in the question itself. With the first code, the original songs are sorted correctly but with duplicates. With this code, `data=data.drop_duplicates(subset=['Artist','Title'],keep='first')`, the listing starts directly from BTS and the rows above it got removed. If you notice, the stream counts of the songs in the 2nd image are higher than the stream count of the top entry (e.g. BTS) in the 3rd image. – kirti purohit Mar 19 '21 at 09:53
