I have a model that consists of 150 sub-models, which currently run one after another in a for loop. To improve performance, I would like to split it into 150 separately deployed models, so that for every request my server receives, it sends 150 API requests (one to each model) and then combines the results. That way the invocations run in parallel, in a map-reduce style.
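To make the idea concrete, here is a rough sketch of the fan-out and combine step, written in Python purely for illustration. `invoke_model` is a hypothetical placeholder for the API request to a single model (it only returns a dummy value here), and averaging stands in for whatever the real combine step would be.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_MODELS = 150  # number of sub-models described above


def invoke_model(model_id: int, payload: dict) -> float:
    """Hypothetical stand-in for one API request to one sub-model.

    In a real setup this would be a network call to wherever that
    model is hosted. Here it returns a dummy score so the sketch runs.
    """
    return payload["x"] * model_id


def predict(payload: dict) -> float:
    """Map: call every sub-model concurrently. Reduce: combine the results."""
    with ThreadPoolExecutor(max_workers=NUM_MODELS) as pool:
        # pool.map returns results in the same order as the model ids
        results = list(
            pool.map(lambda i: invoke_model(i, payload), range(NUM_MODELS))
        )
    # Reduce step: averaging is only a placeholder for the real logic
    return sum(results) / len(results)
```

Threads are used because the work per call would be mostly waiting on the network. The overall latency would then be roughly that of the slowest model rather than the sum of all 150.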
I considered AWS SageMaker multi-model endpoints, but from what I have read, that feature is better suited to invoking models serially than to parallel or concurrent invocation.
I also thought about creating a Lambda function that loads a model and scales on demand (serverless), but that feels odd to me, and I suspect I am overlooking one of SageMaker's intended use cases.
Thanks!