I have a dataframe that has many rows per combination of the 'PROGRAM', 'VERSION' and 'RELEASE_DATE' columns. I want to get a dataframe with all of the combinations of just those three columns. Would this be a job for groupby or distinct?
thx
Since you are not aggregating anything, use unique on just those three columns:
df.select(['PROGRAM','VERSION','RELEASE_DATE']).unique()
If you are not using the Lazy functionality of Polars, this can also be written as:
df[['PROGRAM','VERSION','RELEASE_DATE']].unique()