When using concurrent.futures in Python on large data sets (9 x 200,000 x 4 np.floats), I've noticed that CPU usage is low at the beginning (13%, roughly equivalent to 1 core being used). However, after a while it shoots up to what I expect for multiprocessing (80-90%). Here's a snippet of my code, if interested.
import concurrent.futures
import numpy as np
import matplotlib.pyplot as plt

sections = np.arange(0, 9, 1)
section_result = []
sectioned_results = []

if __name__ == "__main__":
    plt.close()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        generatorlist = [executor.map(makeimage, sections)
                         for _ in range(num_particles)]
        for generator in generatorlist:
            for item in generator:
                section_result.append(item)
Does anyone know the cause of this? The time spent in this initial single-core phase seems to rise exponentially as my number of particles increases. My first thought was memory allocation, since I anticipate this run taking around 1-1.5 GB, but there doesn't seem to be anything in the Python docs about such a startup phase, and I wonder whether I've used the module incorrectly. I've tested this on relatively small data sets (10,000 - 100,000) and have definitely seen the duration of the single-core phase grow with input size.
Many thanks
A