I have the same problem as in How To Solve KeyError: u"None of [Index([..], dtype='object')] are in the [columns]", but the answers there didn't help. First try:

df = pd.read_csv('ABCD.csv', index_col=['A'])
df=df.drop_duplicates(['A'],['B'])

KeyError: Index(['Sample_ID'], dtype='object')

Here I found out that it is impossible to remove the index itself, so I removed index_col from the read_csv call:

df = pd.read_csv('ABCD.csv')
df=df.drop_duplicates(['A'],['B'],keep = 'first')

TypeError: drop_duplicates() got multiple values for argument 'keep'

When I print type(df) it shows "DataFrame". What could be the problem?

1 Answer

I think that should be

df=df.drop_duplicates(['A', 'B'],keep = 'first')

instead of:

df=df.drop_duplicates(['A'],['B'],keep = 'first')

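The subset must be a single list of columns, not several separate arguments. From the docs: "subset: column label or sequence of labels, optional". In your call the second list, ['B'], is taken as the next positional parameter, keep, which is why also passing keep='first' raises "got multiple values for argument 'keep'".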
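A minimal sketch of that call, using a small made-up DataFrame in place of ABCD.csv (the data and the extra column C are assumptions for illustration only). It also shows one possible way around the KeyError from the first attempt: when A is read in as the index it is no longer a column, so subset cannot find it unless the index is moved back into the columns with reset_index().

```python
import pandas as pd

# Hypothetical stand-in for ABCD.csv; values are invented for illustration.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": ["x", "x", "y", "z"],
    "C": [10, 20, 30, 40],
})

# Both labels go into ONE list; keep is passed by keyword.
deduped = df.drop_duplicates(subset=["A", "B"], keep="first")
print(deduped)

# Same data with A as the index, as in the first attempt.
indexed = df.set_index("A")

# Move A back into the columns, de-duplicate, then restore it as the index.
deduped_indexed = (
    indexed.reset_index()
    .drop_duplicates(subset=["A", "B"], keep="first")
    .set_index("A")
)
print(deduped_indexed)
```

Whether A should end up as the index again is a design choice; drop the final set_index("A") if a plain integer index is fine.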

PS: You can also use df.drop_duplicates(['A', 'B'], keep='first', inplace=True). With inplace=True the method modifies df directly and returns None, so you don't need to (and shouldn't) assign the result back to df.
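A short sketch of the inplace behaviour, again on invented data (the DataFrame below is an assumption, not the asker's file). It shows why assigning the result back would lose the DataFrame.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"A": [1, 1, 2], "B": ["x", "x", "y"]})

# With inplace=True the call mutates df and returns None.
result = df.drop_duplicates(["A", "B"], keep="first", inplace=True)

print(result)   # the return value of the in-place call
print(len(df))  # number of rows left in df itself
```

Writing df = df.drop_duplicates(..., inplace=True) would therefore rebind df to None.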

Binh