I have a pandas dataframe:
import pandas as pd

df2 = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2, 3],
                    'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n', 'p']})
I would like to find which values of c have more than one unique type and, for those, return the c value, the number of unique types, and all the unique types concatenated in one string.
I have used these two questions to get this far:
- pandas add column to groupby dataframe
- Python Pandas: concatenate rows with unique values
df2['Unique counts'] = df2.groupby('c')['type'].transform('nunique')
df2[df2['Unique counts'] > 1].groupby(['c', 'Unique counts']).\
agg(lambda x: '-'.join(x))
Out[226]:
                   type
c Unique counts
1 3               m-n-o
2 2             m-m-n-n
This works, but I cannot get only the unique values (for example, in the second row I would like to have only one m and one n).
My questions would be the following:
- Can I skip the intermediate step of creating the 'Unique counts' column and compute it temporarily instead?
- How can I filter for only unique values in the second step?
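
For reference, here is a minimal sketch of the result I am after, written as one possible way to do both steps in a single groupby. It assumes pandas 0.25 or later (for named aggregation), and the column names unique_counts and unique_types are placeholders of mine, not names from the code above. I am not sure this is the idiomatic way to do it.

```python
import pandas as pd

df2 = pd.DataFrame({'c': [1, 1, 1, 2, 2, 2, 2, 3],
                    'type': ['m', 'n', 'o', 'm', 'm', 'n', 'n', 'p']})

# Aggregate once per value of c, without adding a helper column to df2:
# - unique_counts: number of distinct types in the group
# - unique_types: distinct types joined with '-', in order of first appearance
# (both column names are illustrative placeholders)
summary = df2.groupby('c')['type'].agg(
    unique_counts='nunique',
    unique_types=lambda s: '-'.join(s.unique()),
)

# Keep only the values of c that have more than one distinct type.
result = summary[summary['unique_counts'] > 1]
print(result)
```

Here c = 3 is dropped because it has a single type, and the row for c = 2 contains each of m and n only once.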