3

I have a pandas df that's indexed by id and date. I would like to run some regressions for each id in parallel using dask. I know dask splits the df into N partitions but is there a way to force it to split by id column? This way when I do map_partitions I can simply apply my rolling regression function to each partition.

Alex
  • 1,281
  • 1
  • 13
  • 26

0 Answers0