How to select columns based on their data types in pydatatable?

Question

I'm creating a datatable Frame as follows:

import datatable as dt

spotify_songs_dt = dt.fread('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')

and I can inspect its column types with:

spotify_songs_dt.stypes

I would like to select only the numeric columns of this frame. How can this be done the datatable way? In pandas, the DataFrame method select_dtypes() does this.

score 2 · Accepted Answer · answered Jan 27 '20 at 19:02

If you have a frame DT, the most straightforward way to select columns of a specific type is to pass the type itself as the j selector in DT[:, j]:

DT[:, bool]          # all boolean columns
DT[:, int]           # all integer columns
DT[:, float]         # all floating-point columns
DT[:, str]           # string columns
DT[:, dt.int32]      # columns with stype int32
DT[:, dt.ltype.int]  # columns with ltype `int`, same as DT[:, int]

It is also possible to provide a list of types to select:

DT[:, [int, float]]          # integer and floating-point columns
DT[:, [dt.int32, dt.int64]]  # int32 and int64 columns

Sometimes it is more convenient to delete the columns of an unwanted type instead of selecting the ones you need:

del DT[:, str]

Now I got it and thanks for that answer. It’s a simple one, I missed it. — myamulla_ciencia, Jan 27 '20 at 21:01
