I have a pydatatable Frame defined as:
import datatable as dt
from datatable import f, by, first

DT = dt.Frame(
    A=[1, 3, 2, 1, 4, 2, 1],
    B=['A', 'B', 'C', 'A', 'D', 'B', 'A'],
    C=['myamulla', 'skumar', 'cary', 'myamulla', 'api', 'skumar', 'myamulla'])
Out[7]:
| A B C
-- + -- -- --------
0 | 1 A myamulla
1 | 3 B skumar
2 | 2 C cary
3 | 1 A myamulla
4 | 4 D api
5 | 2 B skumar
6 | 1 A myamulla
[7 rows x 3 columns]
I'm trying to filter out the duplicate rows with:
DT[:, first(f[1:]), by([f[0],f[1],f[2]])]
It gives the following output:
Out[10]:
| A B C B.0 C.0
-- + -- -- -------- --- --------
0 | 1 A myamulla A myamulla
1 | 2 B skumar B skumar
2 | 2 C cary C cary
3 | 3 B skumar B skumar
4 | 4 D api D api
[5 rows x 5 columns]
The duplicate rows have been removed, but why does it also create duplicate copies of columns B and C, named B.0 and C.0?