
I want to run inference on an ONNX model that has one input tensor and multiple output tensors (with different dimensions) using ML.NET and onnxruntime. I used .GetColumn to read a single desired output. To get all outputs I tried two different approaches:

1) foreach + calling .GetColumn multiple times:

IEnumerable<float[]> all = Enumerable.Empty<float[]>();
foreach (var output in ModelOutput)
{
    // GetColumn returns a lazy view over one output column;
    // the pipeline runs when the column is enumerated
    IEnumerable<float[]> column = scoredData.GetColumn<float[]>(output);
    all = all.Concat(column);
}
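
For context, the surrounding setup looks roughly like this (a minimal sketch; OnnxInput, inputData, modelLocation, ModelInput, and ModelOutput are placeholders for my actual types and values):

using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext();

// inputData : IEnumerable<OnnxInput> with a vector column matching the model input
IDataView dataView = mlContext.Data.LoadFromEnumerable(inputData);

var pipeline = mlContext.Transforms.ApplyOnnxModel(
    modelFile: modelLocation,
    outputColumnNames: ModelOutput,   // string[] with the ONNX output tensor names
    inputColumnNames: ModelInput);    // string[] with the ONNX input tensor name(s)

var model = pipeline.Fit(dataView);
IDataView scoredData = model.Transform(dataView);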

2) Concatenate outputs into one tensor (when defining my pipeline):

.Append(mlContext.Transforms.ApplyOnnxModel(
    modelFile: modelLocation,
    outputColumnNames: ModelOutput,
    inputColumnNames: ModelInput))
.Append(mlContext.Transforms.Concatenate("all_outs", ModelOutput));
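
The concatenated vector can then be read back through a single column. A minimal sketch (the all_outs name comes from the pipeline above; scoredData is again the transformed data):

// One GetColumn call now covers every output tensor,
// flattened end to end into one float vector per input row
IEnumerable<float[]> allOuts = scoredData.GetColumn<float[]>("all_outs");

foreach (float[] row in allOuts)
{
    Console.WriteLine($"Concatenated output length: {row.Length}");
}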

Both approaches result in very poor inference times. For example, my model needs 250 ms for one output tensor and 2500 ms for ten tensors: the inference time scales linearly with the number of requested outputs. When using the same model in a Python script, it takes less than 100 ms to get all output tensors in one list!
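
For comparison, the Python script gets everything from a single session.run call. The same single-call pattern is available in C# when using the Microsoft.ML.OnnxRuntime package directly instead of the ML.NET transform; a minimal sketch, with the input name and shape as placeholders:

using System;
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

using var session = new InferenceSession(modelLocation);

// "input" and the shape are placeholders for the model's actual input
var inputTensor = new DenseTensor<float>(new[] { 1, 3, 224, 224 });
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input", inputTensor)
};

// A single Run call evaluates the graph once and yields every output tensor
using var results = session.Run(inputs);
foreach (var result in results)
{
    Console.WriteLine($"{result.Name}: {result.AsTensor<float>().Length} values");
}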

My Questions:

  • Is there another way to get multiple outputs in ML.NET?
  • Why does the inference time scale with the number of outputs?
