Consider:
I want to pool the data as shown in the figure above, but my approach takes too much time and RAM.
Can I make it faster or more memory-efficient?
My code looks like this:
import pandas as pd

# For each (Name, Age, Pet, Allergy) group, emit Amount rows, then keep only the key columns.
data = (
    df.groupby(['Name', 'Age', 'Pet', 'Allergy'])
      .apply(lambda x: pd.Series(range(x['Amount'].squeeze())))
      .reset_index()[['Name', 'Age', 'Pet', 'Allergy']]
)
This is a simplified version; my actual dataset is 3.5 GB, so it takes a really long time. Is there a faster way to do this?
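
One direction that might help is to skip the per-group `apply` and repeat row positions in a single vectorized step. The snippet below is only a sketch, not a tested solution for the real dataset. It assumes each (Name, Age, Pet, Allergy) combination appears on exactly one row (which the `.squeeze()` call suggests) and that `Amount` holds non-negative integers. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'Name': ['Ann', 'Bob'],
    'Age': [30, 41],
    'Pet': ['cat', 'dog'],
    'Allergy': ['none', 'pollen'],
    'Amount': [2, 3],
})

cols = ['Name', 'Age', 'Pet', 'Allergy']

# Build an array of row positions in which position i appears Amount[i] times.
positions = np.repeat(np.arange(len(df)), df['Amount'].to_numpy())

# Select the key columns first, then take the repeated rows in one call.
data = df[cols].iloc[positions].reset_index(drop=True)
```

Unlike the `groupby` version, this keeps rows in their original order instead of sorting them by the group keys. The result still contains one row per unit of `Amount`, so its size grows with the sum of that column.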