How can I convert this Python datatable Frame

have = dt.Frame(id=[1,1,2,2],val=["a","b","c","d"])

into this one?

want = dt.Frame(id = [1,2], val=[["a","b"],["c","d"]], types=[dt.Type.int32,dt.Type.obj64])

Specifically, I am trying to find the datatable equivalent of this operation in pandas:

have = pd.DataFrame({"id":[1,1,2,2],"val":["a","b","c","d"]})
have.groupby("id")["val"].apply(list).reset_index()

   id     val
0   1  [a, b]
1   2  [c, d]
langtang
    You are likely referring to [array columns](https://github.com/h2oai/datatable/issues/1692). They are still a work in progress though, so I am not sure whether this specific operation has been implemented yet. – sammywemmy Apr 06 '22 at 21:16

1 Answer

Here is one option, but I'm hoping someone can help with a better solution:

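import datatable as dt
import numpy as np
from datatable import f

# Build one single-row frame per unique id, then stack them with rbind
dt.rbind([dt.Frame(
    id = i,
    val=have[f.id==i,f.val].to_list(),
    types=[dt.Type.int32, dt.Type.obj64]
) for i in np.unique(have["id"])])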
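Another possible route, sketched here under assumptions rather than as a tested datatable recipe, is to do the grouping in plain Python instead of filtering the frame once per id. The lists `ids` and `vals` below are stand-ins for the two columns (with datatable they could presumably be pulled out with `have.to_list()`, which returns one list per column); the names `groups`, `grouped_ids` and `grouped_vals` are only illustrative.

```python
from collections import defaultdict

# Stand-ins for the "id" and "val" columns of `have`
# (assumption: obtained from the frame as plain Python lists).
ids = [1, 1, 2, 2]
vals = ["a", "b", "c", "d"]

# Collect the values belonging to each id, in order of first appearance.
groups = defaultdict(list)
for i, v in zip(ids, vals):
    groups[i].append(v)

grouped_ids = list(groups)            # one entry per distinct id
grouped_vals = list(groups.values())  # one list of values per id

print(grouped_ids, grouped_vals)
```

The two resulting lists have the same shape as the arguments used to build `want` in the question, so they could be passed to `dt.Frame(id=grouped_ids, val=grouped_vals, types=[dt.Type.int32, dt.Type.obj64])`. This makes a single pass over the data, but it leaves datatable for the grouping step, so it may not suit large frames.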
langtang