Q1: In polars-rust, after `.groupby().agg()` we can use `.head(10)` to get the first 10 elements of a column. But the groups have different lengths, and I need the first 20% of the elements in each group (e.g. the first 24 elements of a 120-element group). How can I make that work?

Q2: With a sample dataframe like the one below, my goal is to loop over the dataframe. Because polars is column-major, I downcast the df into several ChunkedArrays and iterated via `iter().zip()`. I found that this is faster than doing the same thing after `groupby(col("date"))`, which loops over list elements. Why is that? In my opinion, the df is shorter after the groupby, which should mean a shorter loop.

Date | Stock | Price |
---|---|---|
2010-01-01 | IBM | 1000 |
2010-01-02 | IBM | 1001 |
2010-01-03 | IBM | 1002 |
2010-01-01 | AAPL | 2900 |
2010-01-02 | AAPL | 2901 |
2010-01-03 | AAPL | 2902 |
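
To make the Q1 requirement concrete, here is a minimal plain-Rust sketch of "the first 20% of each group". It deliberately does not use the polars API: `head_len` and `head_fraction_per_group` are hypothetical helper names, and rounding down is an assumption about what should happen when a group's length is not a multiple of 5.

```rust
use std::collections::BTreeMap;

/// Number of leading rows that make up `percent` percent of a group
/// of `len` rows. Integer division rounds down (an assumption),
/// so 20% of 120 is 24 and 20% of 3 is 0.
fn head_len(len: usize, percent: usize) -> usize {
    len * percent / 100
}

/// Group `(key, value)` rows by key, keeping row order within each group,
/// then keep only the leading `percent` percent of every group.
fn head_fraction_per_group(
    rows: &[(&str, i64)],
    percent: usize,
) -> BTreeMap<String, Vec<i64>> {
    let mut groups: BTreeMap<String, Vec<i64>> = BTreeMap::new();
    for &(key, value) in rows {
        groups.entry(key.to_string()).or_default().push(value);
    }
    for values in groups.values_mut() {
        // The cut-off depends on each group's own length.
        let n = head_len(values.len(), percent);
        values.truncate(n);
    }
    groups
}
```

With 10 `IBM` rows and 5 `AAPL` rows, `head_fraction_per_group(&rows, 20)` keeps 2 and 1 rows respectively. What I am looking for is a way to express the same per-group length inside the polars aggregation instead of the fixed `10`.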