
I have a list of problematic rows, each with a unique identifier (GUID), and I want to remove all of them from a dataframe.

I've tried to use loc to index them, as follows:

df.loc[df['GUID'] != toDel['GUID']]

where df is 5063 rows x 28 cols and toDel['GUID'] is a list of GUIDs that I want to remove from the df.

I expected this to give me a df that doesn't include the problematic rows. However, I get 'ValueError: Can only compare identically-labeled Series objects'. I guess this means they have to be identically sized Series, but then how do I get rid of the problematic GUIDs using this toDel['GUID'] list?

Henry Ecker
galal27

1 Answer


To keep only the rows where GUID is in toDel['GUID'], you can do this:

df.loc[df['GUID'].isin(toDel['GUID'])]
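
Since the question asks to remove those rows rather than keep them, the same mask can be inverted with ~. Here is a minimal sketch, assuming toDel is a DataFrame with a GUID column. The sample data and the names mask, kept and removed are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df and toDel
df = pd.DataFrame({"GUID": ["a1", "b2", "c3", "d4"], "value": [10, 20, 30, 40]})
toDel = pd.DataFrame({"GUID": ["b2", "d4"]})

# isin() checks each GUID for membership in toDel['GUID'], so the two
# objects do not need the same length or the same index labels
mask = df["GUID"].isin(toDel["GUID"])

removed = df.loc[mask]   # rows whose GUID is listed in toDel
kept = df.loc[~mask]     # ~ inverts the mask: rows whose GUID is NOT listed

print(kept)
```

The original != comparison fails because pandas tries to compare the two Series element by element after matching their index labels, which does not work when the labels differ. isin() avoids that by treating toDel['GUID'] as a plain collection of values.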
fmarm