5

I cannot find how to reorder the columns of a polars DataFrame anywhere in the polars DataFrame docs.

thx

rchitect-of-info

3 Answers

12

Using the select method is the recommended way to reorder columns in polars.

Example:

Input:

df
┌─────┬───────┬─────┐
│Col1 ┆ Col2  ┆Col3 │
│ --- ┆ ---   ┆ --- │
│ str ┆ str   ┆ str │
╞═════╪═══════╪═════╡
│ a   ┆ x     ┆ p   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ b   ┆ y     ┆ q   │
└─────┴───────┴─────┘

Output:

df.select(['Col3', 'Col2', 'Col1'])
or
df.select([pl.col('Col3'), pl.col('Col2'), pl.col('Col1')])

┌─────┬───────┬─────┐
│Col3 ┆ Col2  ┆Col1 │
│ --- ┆ ---   ┆ --- │
│ str ┆ str   ┆ str │
╞═════╪═══════╪═════╡
│ p   ┆ x     ┆ a   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ q   ┆ y     ┆ b   │
└─────┴───────┴─────┘

Note: While df[['Col3', 'Col2', 'Col1']] gives the same result (as of version 0.14), it is recommended (link) that you use the select method instead:

We strongly recommend selecting data with expressions for almost all use cases. Square bracket indexing is perhaps useful when doing exploratory data analysis in a terminal or notebook when you just want a quick look at a subset of data.

For all other use cases we recommend using expressions because:

  1. expressions can be parallelized
  2. the expression approach can be used in lazy and eager mode while the indexing approach can only be used in eager mode
  3. in lazy mode the query optimizer can optimize expressions
NFern
5

That seems like a special case of projection to me.

import polars as pl

df = pl.DataFrame({
    "c": [1, 2],
    "a": ["a", "b"],
    "b": [True, False]
})

# Select the columns in alphabetical order of their names
df.select(sorted(df.columns))
shape: (2, 3)
┌─────┬───────┬─────┐
│ a   ┆ b     ┆ c   │
│ --- ┆ ---   ┆ --- │
│ str ┆ bool  ┆ i64 │
╞═════╪═══════╪═════╡
│ a   ┆ true  ┆ 1   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┤
│ b   ┆ false ┆ 2   │
└─────┴───────┴─────┘


ritchie46
4

Turns out it is the same as pandas:

df = df[['PRODUCT', 'PROGRAM', 'MFG_AREA', 'VERSION', 'RELEASE_DATE', 'FLOW_SUMMARY', 'TESTSUITE', 'MODULE', 'BASECLASS', 'SUBCLASS', 'Empty', 'Color', 'BINNING', 'BYPASS', 'Status', 'Legend']]
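
One caveat with a long hand-written list like this: any name left out is silently dropped from the result, whether you use df[[...]] or select. A small check before reordering can catch that. This is only a sketch; check_reorder is a hypothetical helper, and the short lists stand in for df.columns and the requested order:

```python
def check_reorder(current, requested):
    """Compare a requested column order against the current columns.

    Returns three lists: columns that would be dropped, requested names
    that do not exist, and names that were requested more than once.
    """
    current_set, requested_set = set(current), set(requested)
    dropped = [c for c in current if c not in requested_set]
    unknown = [c for c in requested if c not in current_set]
    duplicated = sorted({c for c in requested if requested.count(c) > 1})
    return dropped, unknown, duplicated


current = ["PRODUCT", "PROGRAM", "VERSION"]  # stands in for df.columns
requested = ["VERSION", "PRODUCT"]           # PROGRAM was forgotten

dropped, unknown, duplicated = check_reorder(current, requested)
print(dropped)  # ['PROGRAM']
```

If all three lists come back empty, the requested list is a pure reordering of the existing columns.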
rchitect-of-info