In pandas we have the `pandas.DataFrame.select_dtypes` method, which selects columns based on their dtype. Is there a similar way to do this in Polars?
astrojuanlu
2 Answers
8
One can pass a data type to `pl.col`:
import polars as pl
df = pl.DataFrame(
{
"id": [1, 2, 3],
"name": ["John", "Jane", "Jake"],
"else": [10.0, 20.0, 30.0],
}
)
print(df.select([pl.col(pl.Utf8), pl.col(pl.Int64)]))
Output:
shape: (3, 2)
┌──────┬─────┐
│ name ┆ id │
│ --- ┆ --- │
│ str ┆ i64 │
╞══════╪═════╡
│ John ┆ 1 │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ Jane ┆ 2 │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ Jake ┆ 3 │
└──────┴─────┘

astrojuanlu
Adding to the discussion, you can use `df.select(pl.col(pl.NUMERIC_DTYPES))` to select all numeric columns. I'm looking for a way to select non-numeric columns now. – Ludecan Jul 04 '23 at 22:35
2
Starting from Polars 0.18.1, you can use the `polars.selectors.by_dtype` selector to select all columns matching the given dtypes.
>>> import polars as pl
>>> import polars.selectors as cs
>>>
>>> df = pl.DataFrame(
... {
... "id": [1, 2, 3],
... "name": ["John", "Jane", "Jake"],
... "else": [10.0, 20.0, 30.0],
... }
... )
>>>
>>> print(df.select(cs.by_dtype(pl.Utf8, pl.Int64)))
shape: (3, 2)
┌─────┬──────┐
│ id ┆ name │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪══════╡
│ 1 ┆ John │
│ 2 ┆ Jane │
│ 3 ┆ Jake │
└─────┴──────┘
To select all non-numeric type columns:
>>> import polars as pl
>>> import polars.selectors as cs
>>>
>>> df = pl.DataFrame(
... {
... "id": [1, 2, 3],
... "name": ["John", "Jane", "Jake"],
... "else": [10.0, 20.0, 30.0],
... }
... )
>>>
>>> print(df.select(~cs.by_dtype(pl.NUMERIC_DTYPES)))
>>> # OR print(df.select(~cs.numeric()))
shape: (3, 1)
┌──────┐
│ name │
│ --- │
│ str │
╞══════╡
│ John │
│ Jane │
│ Jake │
└──────┘

Abdul Niyas P M