
Let’s say I want to group by A and B and calculate the sum of Sales. How should I go about it?

import pandas as pd
import ray
ray.init()

rdf = ray.data.from_pandas(pd.DataFrame({'A': [1, 2, 3], 'B': [1, 1, 4], 'Sales': [20, 30, 40]}))

I tried rdf.groupby(['A', 'B']), but it raises an error stating that the key column must not be a list.

Justice_Lords

1 Answer


This is not supported yet, but support is planned: https://github.com/ray-project/ray/issues/25732

The workaround is to add a new column to the dataset, say "C", that combines "A" and "B" in the way you want, and then group by column "C".
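A minimal sketch of that workaround, assuming the column names from the question. The helper names add_composite_key and sum_sales_by_a_and_b and the "_" separator are my own choices for illustration, not part of Ray. It builds "C" as a string with map_batches on pandas batches and then groups by that single column:

```python
import pandas as pd


def add_composite_key(batch: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the batch with a string column "C" built from "A" and "B"."""
    batch = batch.copy()
    # "_" is an arbitrary separator; choose one that cannot occur in the values.
    batch["C"] = batch["A"].astype(str) + "_" + batch["B"].astype(str)
    return batch


def sum_sales_by_a_and_b(df: pd.DataFrame) -> pd.DataFrame:
    """Sum "Sales" per (A, B) pair by grouping a Ray dataset on the composite key."""
    import ray  # imported here so add_composite_key stays usable without Ray

    ds = ray.data.from_pandas(df)
    # Apply the helper to every batch, then group by the single key column.
    ds = ds.map_batches(add_composite_key, batch_format="pandas")
    return ds.groupby("C").sum("Sales").to_pandas()
```

Calling sum_sales_by_a_and_b on the DataFrame from the question should give one row per distinct (A, B) pair. The name of the aggregated column is chosen by Ray and may differ between versions, so check the result's columns before relying on it. If you need A and B back as separate columns, split "C" on the separator afterwards.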

jianxiao