
I stored some financial market data in a Polars DataFrame. For analysis, it is fast to run a groupby("date").agg() operation.

But in a realtime scenario, new data arrives continuously, and I don't want to concat the new data with the old data again and again: that is slow and uses a lot of memory. Is there a blazing fast way to split the old DataFrame into small DataFrames grouped by the datetime column, stored in a vector or hashmap, so that when new data comes in I can just push it onto the vector for future calculation?

Hakase

1 Answer


Polars has a DataFrame::partition_by function for this.
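For example, in the Python API, a minimal sketch with made-up column names; note that the exact dict-key format of as_dict=True can vary between Polars versions (older releases keyed single-column partitions by a scalar rather than a tuple):

```python
import polars as pl

# Hypothetical sample of market data; column names are assumptions.
df = pl.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02"],
        "price": [100.0, 101.5, 99.8],
    }
)

# Split into one small DataFrame per date. as_dict=True returns a
# hashmap-like dict keyed by the partition value instead of a list.
parts = df.partition_by("date", as_dict=True)

# When new rows arrive, touch only the affected partition instead of
# re-concatenating the whole history.
new_rows = pl.DataFrame({"date": ["2023-01-02"], "price": [100.2]})
key = ("2023-01-02",)  # recent Polars versions key the dict by tuples
if key in parts:
    parts[key] = pl.concat([parts[key], new_rows])
else:
    parts[key] = new_rows
```

Without as_dict, partition_by returns a list of DataFrames, which matches the "vector" option in the question; the Rust API exposes the same functionality as DataFrame::partition_by.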

ritchie46