I have a task where I need to run the same function on many different pandas dataframes. I load all the dataframes into a list, then pass it to Pool.map from the multiprocessing module. The function itself has been vectorized as much as possible; it contains a few if/else clauses and no matrix operations.
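Roughly, the setup described above looks like the sketch below. The names process_frame and run_all, the column names, and the toy data are placeholders for illustration only, not the real code.

```python
import multiprocessing as mp

import pandas as pd


def process_frame(df):
    """Hypothetical stand-in for the real per-dataframe function."""
    out = df.copy()
    # Vectorized branch in place of an element-wise if/else:
    # keep x where it is non-negative, otherwise use -x.
    out["y"] = out["x"].where(out["x"] >= 0, -out["x"])
    return out


def run_all(frames, n_workers=10):
    # Pool.map pickles each dataframe, sends it to a worker process,
    # and pickles the result back to the parent.
    with mp.Pool(n_workers) as pool:
        return pool.map(process_frame, frames)


if __name__ == "__main__":
    # Toy data standing in for the real list of dataframes.
    frames = [pd.DataFrame({"x": [i - 2, i - 1, i]}) for i in range(4)]
    results = run_all(frames, n_workers=2)
    print(results[0])
```

With this pattern the number passed to Pool is the only knob, which is why the question comes down to how to get more workers than the 10 physical cores allow.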
I'm currently using a 10-core Xeon and would like to speed things up, ideally going from Pool(10) to Pool(xxx). I see two possibilities:
GPU processing. From what I have read, though, I'm not sure I can achieve what I want, and it would in any case need a lot of code modification.
Xeon Phi. I know it's being discontinued, but supposedly code adaptation is easier, and if that's really the case I'd happily get one.
Which path should I concentrate on? Any other alternatives?
Software: Ubuntu 18.04, Python 3.7. Hardware: X99 chipset, 10-core Xeon (no HT).