If you have a groupby object based on a dask dataframe, why does len(<groupby object>)
raise an error? (Bug or feature?)

Back2Basics
1 Answer
This just hasn't been implemented. You might want to raise an issue (or, better yet, a pull request). Pragmatically, I would just call nunique
on the series you are grouping by, which gives the number of groups:
Before
g = df.groupby(df.x + df.y)
result = len(g)
After
result = (df.x + df.y).nunique()
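Operationally this is nicer because it can stay lazy (the result of len in Python must be a concrete integer, which would force an immediate computation) and because you can choose the nunique_approx variant, which returns an approximate count and can be far faster on large data. Call .compute() on the result when you need the actual number.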

MRocklin