
I have a dataframe that has many rows per combination of the 'PROGRAM', 'VERSION' and 'RELEASE_DATE' columns. I want to get a dataframe with all of the combinations of just those three columns. Would this be a job for groupby or distinct?

thx

rchitect-of-info

1 Answer


Since you are not aggregating anything, use unique rather than groupby:

df.select(['PROGRAM','VERSION','RELEASE_DATE']).unique()

If you are not using the lazy API of Polars, this can also be written as:

df[['PROGRAM','VERSION','RELEASE_DATE']].unique()
Anton Daneyko