Is there a simple way to manually iterate through existing pandas groupby objects?

Question

Is there a simple way to manually iterate through existing pandas groupby objects?

import pandas as pd

df = pd.DataFrame({'x': [0, 1, 2, 3, 4], 'category': ['A', 'A', 'B', 'B', 'B']})
grouped = df.groupby('category')

In the application, a `for name, group in grouped:` loop follows. For manual testing I would like to do something like `group = grouped[0]` and run the code within the for loop. Unfortunately this does not work. The best thing I could find (here) was

group = df[grouped.ngroup()==0]

which relies on the original DataFrame and not solely on the groupby object and is therefore not optimal imo.

How about `get_group` as in this [answer](https://stackoverflow.com/a/40630950/5276797)? — IanS, Jul 30 '19 at 08:00
@IanS it helps when I know the names of the groups. But it would be way more convenient just to pass a number. — Qaswed, Jul 30 '19 at 08:06
can you create a dict of groups with a factor, example `d={f"group{i}":g for i,g in df.groupby(df.category.factorize()[0])}` and then call each group like `d['group0']` — anky, Jul 30 '19 at 08:12
@anky_91 So you suggest to build a second groupby object using `factorize()`, right? Is it possible to use the existing groupby-object to build such a dictionary? — Qaswed, Jul 30 '19 at 08:38
@Qaswed the dict in my comment is using the dataframe directly, not a second groupby. :) — anky, Jul 30 '19 at 09:03
You can try `unique_cats = df["category"].unique()` and then `df[df["category"] == unique_cats[0]]`. The result is the same, since `df.groupby("col").apply(myfunction)` does the same thing iteratively. — Ilker Kurtulus, Jul 30 '19 at 11:14
@anky_91 By "second groupby", I mean that it doesn't use the existing `grouped`. — Qaswed, Jul 30 '19 at 12:35
@Qaswed the answer in the comment does not use `grouped`; it uses the dataframe from scratch and does a groupby — anky, Jul 30 '19 at 12:36
@anky_91 Thank you for your suggestion to build the groupby object differently. I clarified in my question that I want to know how best to iterate through **existing** groupby objects. — Qaswed, Jul 31 '19 at 07:04
What do you mean by "manually iterate" *exactly*? You showed an *indexing* operation. Just create a list out of your groupby object. If you want to access it by group label, create a dict. — juanpa.arrivillaga, Jul 31 '19 at 07:15
In the end it's unclear to me why you need this at all. Since you're iterating manually, why not select a group by name rather than by some arbitrary index? A group is just a subset of the dataframe anyway. — IanS, Jul 31 '19 at 08:38
And if you must absolutely use an index, then run the loop `for name, group in grouped` and break when you have reached the number of iterations you're interested in. — IanS, Jul 31 '19 at 08:40
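
The list and dict suggestions from the comments above can be sketched as follows. This is only an illustration built on the question's example data; the variable names (`groups_as_list`, `groups_by_name`) are made up for this sketch, and it materializes every group at once, which may be wasteful for a large DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'x': [0, 1, 2, 3, 4], 'category': ['A', 'A', 'B', 'B', 'B']})
grouped = df.groupby('category')

# Iterating a GroupBy yields (name, group) pairs, so list() gives
# something that can be indexed by position.
groups_as_list = list(grouped)
name0, group0 = groups_as_list[0]

# The same pairs can be turned into a dict keyed by group name.
groups_by_name = dict(groups_as_list)
group_b = groups_by_name['B']

# get_group() selects by name directly from the existing groupby object.
same_as_group0 = grouped.get_group(name0)
```

Only the existing `grouped` object is used here; the original `df` is not touched again after the groupby.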

score 2 · Accepted Answer · answered Jul 31 '19 at 08:45

Any iterable (here the GroupBy object) can be turned into an iterator:

group_iter = iter(grouped)

The line below will be the equivalent of selecting the first group (indexed by 0):

name, group = next(group_iter)

To get the next group, just repeat:

name, group = next(group_iter)

And so on...
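
Put together with the question's example data, the steps above might look like the sketch below. Passing a default to `next()` is an addition not in the original answer; it avoids a `StopIteration` once the groups run out.

```python
import pandas as pd

df = pd.DataFrame({'x': [0, 1, 2, 3, 4], 'category': ['A', 'A', 'B', 'B', 'B']})
grouped = df.groupby('category')

# Turn the existing GroupBy object into an iterator.
group_iter = iter(grouped)

# First call: the first group.
first_name, first_group = next(group_iter)

# Second call: the next group.
second_name, second_group = next(group_iter)

# No groups are left; with a default, next() returns it instead of raising.
exhausted = next(group_iter, None)
```

After each `next()` call, the loop body can be run by hand on the current `name` and `group` pair.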

Source: https://treyhunner.com/2018/02/python-range-is-not-an-iterator/
