from datatable import dt, f, g, by, update, join, sort
tt = dt.Frame({'a' : ['A1','A2','A3'], 'b':[100,200,300]})
print(tt)
   | a     b
-- + --  ---
 0 | A1  100
 1 | A2  200
 2 | A3  300

[3 rows x 2 columns]
How can I remove the 'A' from the values in column a and assign the result, as a number, to a new column 'c' the datatable way (that is, without pandas)?
With the help of pandas it would look like this:
tt['c'] = tt.to_pandas()['a'].str.replace('A','').astype(int)
My attempt at a datatable-native version does not work:
tt[:, update(c = [int(x.replace('A','')) for x in f.a])]
TypeError: 'datatable.FExpr' object is not iterable
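The error arises because f.a is a lazy column expression, not a sequence of values, so a list comprehension cannot iterate over it. To make the intended result concrete, here is a minimal plain-Python sketch of the per-value transformation on the contents of column a. It does not use datatable at all, and `strip_prefix_to_int` is a hypothetical helper name introduced only for illustration.

```python
def strip_prefix_to_int(values, prefix="A"):
    """Remove `prefix` from each string and convert what is left to int.

    Hypothetical helper for illustration; not part of datatable.
    """
    return [int(v.replace(prefix, "")) for v in values]


# The values of column a, written out by hand as an ordinary Python list.
a_values = ["A1", "A2", "A3"]

# Desired contents of the new column c.
c_values = strip_prefix_to_int(a_values)
print(c_values)  # [1, 2, 3]
```

With a real Frame, the input list could presumably be obtained with tt['a'].to_list()[0], but that is a Python-level loop rather than a vectorized datatable expression, which is what the question is asking for.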
By the way, for a frequent user of Python pandas and R data.table, is there an advanced or complete cookbook that helps with the transition from R data.table to py-datatable? There is a page on the website, but it does not go far enough.