
As stated in the title, I'm trying to parallelize the predictions of multiple (sequential) Keras models.

The challenge is that I have ~200 small models (one per signal, each trained individually, so combining them into one larger model is off the table). These models are used for predictions with a relatively fast cycle time.

Currently they run in a for-loop like so:

from time import perf_counter

start_time = perf_counter()

for model in modelList:
    modelInput = getModelInput()  # dummy for extracting model-specific input data
    prediction = model(modelInput, training=False).numpy()

end_time = perf_counter()

print(f'Prediction loop execution: {(end_time - start_time) * 1000:.2f} ms')

With this code the whole execution takes around 1,000 ms, i.e. roughly 5 ms per model.

After researching the topic (including multiprocessing, which is already used in other parts of the project), I tried a multiprocessing.Pool. Unfortunately this led to other issues ...

Trying to pass the model with a multiprocessing.Manager like

# simplified code example
manager = multiprocessing.Manager()
modelList = manager.list()

modelList.append(model)

led to the terminal output

INFO:tensorflow:Assets written to: ram://4893b60f-addb-42cc-8f73-ecd08f4345f1/assets

and took forever ...

Next I tried a workaround I found: loading the model inside the worker process

def poolPrediction(modelPath: str):
    model = load_model(modelPath, compile=False)

    modelInput = getModelInput()
    prediction = model(modelInput, training=False).numpy()

    return prediction

start_time = perf_counter()

with multiprocessing.Pool(4) as predictionPool:
    results = predictionPool.map_async(poolPrediction, modelPaths).get()

end_time = perf_counter()

print(f'Prediction loop execution: {(end_time - start_time) * 1000:.2f} ms')

but it takes even longer (~1,700 ms).

I'm sure I'm missing something huge and obvious here and therefore I'm grateful for the smallest hint or idea!

Viktor Katzy