I was wondering if someone could please enlighten me.
I am trying to compute a cumulative sum of pty_nber over (i.e. grouped by) a specific column (Declaration).
My original idea was to use something along the lines of:
dataset.filter(pl.col("pty_nber").first().over("Declaration").cumsum() < 30 )
Unfortunately, this does not seem to take the .over() into account and just cumulatively sums over all the rows. So rather than summing 4 + 7 + 8, etc., it sums 4 + 4 + 4 + 4 + 7 ...
The goal is to show at least a few complete declarations and not cut one off in the middle.
Thanks in advance :)
As an example please see below:
--> and then filter out the rows whose CUMSUM is over a certain threshold, such as 30, so that no declaration is left incomplete (i.e. every declaration that is kept includes all the pty_nber rows for that specific declaration)
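To make the intended result concrete, here is a rough, library-free sketch of the logic I am after, written in plain Python rather than polars. The rows, the declaration labels A to D and the helper name keep_complete_declarations are all made up for illustration; the only assumption taken from my data is that pty_nber repeats on every row of a declaration.

```python
from itertools import accumulate

# Hypothetical rows: (Declaration, pty_nber).
# Assumption: pty_nber is repeated on every row of the same declaration.
rows = [("A", 4)] * 4 + [("B", 7)] * 7 + [("C", 8)] * 8 + [("D", 12)] * 12


def keep_complete_declarations(rows, threshold):
    """Keep all rows of each declaration whose running total stays under threshold."""
    # Take one pty_nber per declaration, in order of first appearance.
    first_per_decl = {}
    for decl, n in rows:
        first_per_decl.setdefault(decl, n)
    # Running total across declarations (4, 4 + 7, 4 + 7 + 8, ...), not across rows.
    running = dict(zip(first_per_decl, accumulate(first_per_decl.values())))
    # A declaration is either kept with all of its rows or dropped entirely.
    return [(decl, n) for decl, n in rows if running[decl] < threshold]


kept = keep_complete_declarations(rows, threshold=30)
```

With this made-up data the running totals per declaration are 4, 11, 19 and 31, so A, B and C are kept in full and D is dropped as a whole. What I am looking for is the polars expression that does the same thing.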