I have a pandas DataFrame that's indexed by `id` and `date`. I would like to run some regressions for each `id` in parallel using dask. I know dask splits the DataFrame into N partitions, but is there a way to force it to split by the `id` column? This way, when I call `map_partitions`, I can simply apply my rolling regression function to each partition.
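
For concreteness, here is a minimal pandas-only sketch of the setup described above. The toy data, the column names `x` and `y`, the window length, and the `rolling_slope` helper are all assumptions made for illustration, not the actual data or regression. It only shows the per-`id` unit of work that each partition would need to contain in full:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 ids with 20 daily rows each, indexed by (id, date).
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [[1, 2, 3], pd.date_range("2020-01-01", periods=20)],
    names=["id", "date"],
)
df = pd.DataFrame(
    {"x": rng.normal(size=len(index)), "y": rng.normal(size=len(index))},
    index=index,
)


def rolling_slope(group, window=5):
    """Placeholder regression: rolling OLS slope of y on x for a single id."""
    g = group.droplevel("id")  # keep only the date level as the index
    cov = g["x"].rolling(window).cov(g["y"])  # rolling sample covariance
    var = g["x"].rolling(window).var()        # rolling sample variance
    return (cov / var).to_frame("slope")      # slope = cov(x, y) / var(x)


# Each id is an independent unit of work; this loop is the serial version
# of what would ideally run once per partition.
pieces = {key: rolling_slope(group) for key, group in df.groupby(level="id")}
result = pd.concat(pieces, names=["id"])
```

The regression for one `id` needs all of that `id`'s rows (a rolling window cannot span a partition boundary), which is why the split has to follow `id` rather than an arbitrary row count.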

Alex