
In pandas, the pandas.DataFrame.select_dtypes method selects columns based on their dtype. Is there an equivalent way to do this in Polars?

astrojuanlu

2 Answers


You can pass a data type to pl.col to select every column of that dtype:

import polars as pl

df = pl.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["John", "Jane", "Jake"],
        "else": [10.0, 20.0, 30.0],
    }
)
print(df.select([pl.col(pl.Utf8), pl.col(pl.Int64)]))

Output:

shape: (3, 2)
┌──────┬─────┐
│ name ┆ id  │
│ ---  ┆ --- │
│ str  ┆ i64 │
╞══════╪═════╡
│ John ┆ 1   │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ Jane ┆ 2   │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ Jake ┆ 3   │
└──────┴─────┘
astrojuanlu
Adding to the discussion, you can use `df.select(pl.col(pl.NUMERIC_DTYPES))` to select all numeric columns. I'm looking for the way to select non numeric columns now. – Ludecan Jul 04 '23 at 22:35

Starting from Polars 0.18.1, you can use the polars.selectors.by_dtype selector to select all columns matching the given dtypes.

>>> import polars as pl
>>> import polars.selectors as cs
>>> 
>>> df = pl.DataFrame(
...     {
...         "id": [1, 2, 3],
...         "name": ["John", "Jane", "Jake"],
...         "else": [10.0, 20.0, 30.0],
...     }
... )
>>> 
>>> print(df.select(cs.by_dtype(pl.Utf8, pl.Int64)))
shape: (3, 2)
┌─────┬──────┐
│ id  ┆ name │
│ --- ┆ ---  │
│ i64 ┆ str  │
╞═════╪══════╡
│ 1   ┆ John │
│ 2   ┆ Jane │
│ 3   ┆ Jake │
└─────┴──────┘

To select all non-numeric columns:

>>> import polars as pl
>>> import polars.selectors as cs
>>> 
>>> df = pl.DataFrame(
...     {
...         "id": [1, 2, 3],
...         "name": ["John", "Jane", "Jake"],
...         "else": [10.0, 20.0, 30.0],
...     }
... )
>>> 
>>> print(df.select(~cs.by_dtype(pl.NUMERIC_DTYPES)))
>>> # OR print(df.select(~cs.numeric()))
shape: (3, 1)
┌──────┐
│ name │
│ ---  │
│ str  │
╞══════╡
│ John │
│ Jane │
│ Jake │
└──────┘
Abdul Niyas P M