
Given a pandas DataFrame df with at least the columns C1, C2, C3, how would you get all the unique (C1, C2, C3) combinations as a new DataFrame?

In other words, something similar to:

SELECT C1,C2,C3
FROM T
GROUP BY C1,C2,C3

I tried this:

print(df.groupby(by=['C1','C2','C3']))

but I'm getting:

<pandas.core.groupby.DataFrameGroupBy object at 0x000000000769A9E8>
Ofek Ron

1 Answer


I believe you need drop_duplicates if you want all unique triples:

df = df.drop_duplicates(subset=['C1','C2','C3'])
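
As a rough illustration of what this call returns, here is a small sketch on made-up data (the extra column C4 and all values are assumptions, not from the question). With subset, the other columns are kept and the first row of each triple survives; selecting the three columns before calling drop_duplicates gives only the triples, which is probably closest to the SQL query:

```python
import pandas as pd

# Hypothetical sample data; C4 stands in for any extra column.
df = pd.DataFrame({
    "C1": [1, 1, 2, 2],
    "C2": ["a", "a", "b", "b"],
    "C3": [True, True, False, True],
    "C4": [10, 20, 30, 40],
})

# One row per unique (C1, C2, C3) triple; all columns are retained
# and the first occurrence of each triple is the one that is kept.
first_rows = df.drop_duplicates(subset=["C1", "C2", "C3"])

# Only the three key columns, with a fresh 0..n-1 index.
unique_triples = df[["C1", "C2", "C3"]].drop_duplicates().reset_index(drop=True)

print(first_rows)
print(unique_triples)
```
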

If you want to use groupby, add first:

df = df.groupby(by=['C1','C2','C3'], as_index=False).first()
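
The two approaches are not quite interchangeable. A minimal sketch on made-up data (column C4 and the values are assumptions for illustration): groupby sorts the result by the key columns by default, and first() takes the first non-null value in each remaining column, whereas drop_duplicates keeps the first matching row as-is and preserves the original order.

```python
import pandas as pd

# Hypothetical sample data with a missing value in the extra column C4.
df = pd.DataFrame({
    "C1": [2, 1, 1],
    "C2": ["b", "a", "a"],
    "C3": ["y", "x", "x"],
    "C4": [30.0, None, 20.0],
})
keys = ["C1", "C2", "C3"]

# Sorted by the keys; C4 holds the first non-null value per group.
grouped = df.groupby(by=keys, as_index=False).first()

# Original row order; C4 is whatever the first row of each triple had.
deduped = df.drop_duplicates(subset=keys)

print(grouped)
print(deduped)
```

If only the key columns matter, both give the same set of triples; the difference shows up in the row order and in the values of the other columns.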
jezrael