Suppose I create a Polars LazyFrame from a list of CSV files using pl.concat():

df = pl.concat([pl.scan_csv(file) for file in ['file1.csv', 'file2.csv']])

Is the data in the resulting LazyFrame guaranteed to follow the exact order of the input files, or is there a scenario where the query optimizer could reorder it?

DataWiz

1 Answer

The order is maintained. The engine may execute the individual scans in a different order, but the final result always has the same order as the lazy computations passed in by the caller.

ritchie46