
In pandas this happens automatically: calling pd.concat([df1, df2, df3]) just works, and any frame that didn't previously have a column gets that column filled with NaNs.

In polars I get a 'shape error' with the message that the columns differ (11 cols in df1 vs 12 cols in df2).

Cornelius Roemer
zacko

1 Answer


Polars cares about schema correctness by default and prefers throwing an error over silently succeeding, since a schema mismatch might indicate a bug in your program.

If you want polars to add the columns, add the kwarg how="diagonal" to pl.concat.

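import polars as pl

# Two frames that share column "a" but otherwise have different columns.
df_a = pl.DataFrame({
    "a": [1, 2, 3],
    "b": [True, None, False],
})

df_b = pl.DataFrame({
    "a": [4, 5],
    "c": ["bar", "ham"],
})

# how="diagonal" takes the union of the columns and fills the gaps with null.
pl.concat([df_a, df_b], how="diagonal")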
shape: (5, 3)
┌─────┬───────┬──────┐
│ a   ┆ b     ┆ c    │
│ --- ┆ ---   ┆ ---  │
│ i64 ┆ bool  ┆ str  │
╞═════╪═══════╪══════╡
│ 1   ┆ true  ┆ null │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2   ┆ null  ┆ null │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3   ┆ false ┆ null │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 4   ┆ null  ┆ bar  │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 5   ┆ null  ┆ ham  │
└─────┴───────┴──────┘

Cornelius Roemer
ritchie46
    Damn i am blind. As soon as my eyes saw 'diagonal' in the documentation they skipped the paragraph immediately thinking it's something completely different. p.s. i love u (was starting to write spaghetti for loops with .lit() to get the job done) – zacko Jun 16 '22 at 09:01