I would like to identify the duplicated rows in a py-datatable Frame by group, and create a boolean helper column C that flags them.
It should work along the lines of this:
import datatable as dt
DT = dt.Frame(A=[1, 2, 1, 2, 2, 1], B=list("XXYYYY"))
I get TypeError: Expected a Frame, instead got class 'datatable.expr.expr.Expr' when I apply the grouping to find the unique observations within a group.
It seems that unique() does not work in this context, and the documentation on the functions available in py-datatable is pretty sparse: https://datatable.readthedocs.io/en/v0.10.1/using-datatable.html#perform-groupby-calculations
I'm not sure whether py-datatable is simply that far behind R's data.table and this is not possible yet. It seems like a basic operation, but I can't find a solution. Does someone have one, or can someone point me to suitable resources? Ideally the answer would include the syntax for assigning the bool (duplicate or not) to a new column C in one line of code.
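To make the expected result concrete, here is a minimal sketch in plain Python (deliberately not datatable) of the flag I am after. It assumes that "duplicate" means a value of A that occurs more than once within the same group of B, and that every such row is flagged, including the first occurrence. The names counts and C are only for illustration.

```python
from collections import Counter

# Same data as the Frame above, held as plain Python lists.
A = [1, 2, 1, 2, 2, 1]
B = list("XXYYYY")

# Count how often each value of A occurs within each group of B.
counts = Counter(zip(B, A))

# Assumption: a row is a duplicate if its (B, A) pair occurs more than once,
# so the first occurrence is flagged as well.
C = [counts[(b, a)] > 1 for a, b in zip(A, B)]

print(C)  # [False, False, True, True, True, True]
```

If only the second and later occurrences should count as duplicates, the first two rows of group Y would be False instead.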