How to do multiple inferencing on onnx(onnxruntime) similar to sklearn

Question

I want to run inference on many inputs with an ONNX model using onnxruntime in Python. One way is a for loop over the inputs, but that seems crude and slow. Is there a way to predict for all inputs in one call, the way sklearn does?

Single prediction on onnxruntime:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("xxxxx.onnx")
input_name = sess.get_inputs()  # list of input descriptors, one per model input
label_name = sess.get_outputs()[0].name

# one sample: each of the three inputs has shape (1, 1)
pred_onnx = sess.run([label_name], {
    input_name[0].name: np.array([[40]]).astype(np.int64),
    input_name[1].name: np.array([[0]]).astype(np.int64),
    input_name[2].name: np.array([[0]]).astype(np.int64)
})
pred_onnx

>> Output: [array([[23]], dtype=float32)]

Single or multiple predictions in sklearn (depending on the size of x_test):

test_predictions = model.predict(x_test)

score 0 · Answer 1 · answered Apr 28 '21 at 05:56

The best way is for the ONNX model to support batches. Based on the input you're providing, it may already do that. Your 3 inputs appear to have shape [1,1] and your output has shape [1,1], which may mean the first dimension is the batch size. An example input with shape [2,1] (a batch of 2 samples, 1 element per sample) would look like [[40],[50]].

I'm guessing that if you provide a batch of two samples you'd get two outputs, so something like this:

pred_onnx = sess.run([label_name], {
  input_name[0].name: np.array([[40],[40]]).astype(np.int64),
  input_name[1].name: np.array([[0],[0]]).astype(np.int64),
  input_name[2].name: np.array([[0],[0]]).astype(np.int64)
})

This may give an output of [array([[23],[23]], dtype=float32)].
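
Rather than guessing, you can check the shapes the model declares for its inputs. The following is only a sketch: `batch_axis_is_dynamic` is a hypothetical helper name, and it assumes the first dimension is the batch axis. onnxruntime reports a dynamic dimension as None or as a symbolic name (a string), and a fixed dimension as an int.

```python
def batch_axis_is_dynamic(sess):
    """Map each input name to True when its first dimension is not a fixed int.

    `sess` is expected to be an onnxruntime.InferenceSession (anything with a
    compatible get_inputs() works). Assumption: axis 0 is the batch axis.
    """
    report = {}
    for node in sess.get_inputs():
        shape = node.shape  # e.g. [None, 1], ['N', 1] or [1, 1]
        # A scalar input (empty shape) has no batch axis at all.
        report[node.name] = bool(shape) and not isinstance(shape[0], int)
    return report


# Usage with a real model (file name taken from the question):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("xxxxx.onnx")
#   print(batch_axis_is_dynamic(sess))
```

If every input reports True, a feed such as [[40],[50]] should go through in a single sess.run call. If an input reports False because its first dimension is fixed at 1, the model was most likely exported with a fixed batch size, and you would need to re-export it with a dynamic first dimension or keep calling it once per sample.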

Ken Jiiii · Answer 2 · 2022-10-06T11:23:47.243

Here is a small working example using batch inference on a sklearn model exported to ONNX.

from sklearn import datasets, model_selection, linear_model, pipeline, preprocessing
import numpy as np
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime
import pandas as pd

# load toy dataset, define sklearn pipeline and fit model
dataset = datasets.load_diabetes()
X, y = dataset.data, dataset.target
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y)
regr = pipeline.Pipeline(
    [("std", preprocessing.StandardScaler()), ("reg", linear_model.LinearRegression())]
)
regr.fit(X_train, y_train)

# export model to onnx
initial_type = list(
    zip(
        dataset.feature_names,
        [FloatTensorType([None, 1]) for _ in range(len(dataset.feature_names))],
    )
)
onx = convert_sklearn(regr, initial_types=initial_type)
with open("model.onnx", "wb") as f:
    f.write(onx.SerializeToString())

# load model in onnx runtime and make batch inference
df_test = pd.DataFrame(X_test, columns=dataset.feature_names)
sess = onnxruntime.InferenceSession("model.onnx")
inputs = {
    f: df_test[f].astype(np.float32).values.reshape(-1, 1)
    for f in dataset.feature_names
}
label_name = sess.get_outputs()[0].name
pred_onx = sess.run([label_name], inputs)[0]


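# compare results: sklearn predicts in float64, ONNX in float32, so expect tiny differences
print(regr.predict(X_test)[:5])
print(pred_onx.flatten()[:5])
print(np.abs(regr.predict(X_test) - pred_onx.flatten()).max())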

I think the trickiest part is getting the input shape right for inference. Since we specified FloatTensorType([None, 1]), each input array must have shape (x, 1), where x is the number of samples in the batch. Thus we need to reshape the column values from shape (x,) to (x, 1).
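
To get something that reads like sklearn's model.predict(x_test), the reshaping can be wrapped in a small function. This is a minimal sketch, not part of onnxruntime: `predict` is a hypothetical helper, and it assumes the model has one (N, 1) input per feature column, that the columns of `x` are in the same order as `sess.get_inputs()`, and that the first output holds one prediction per sample.

```python
import numpy as np


def predict(sess, x, dtype=np.float32):
    """sklearn-style predict for a model with one (N, 1) input per feature.

    Assumptions (not checked against the model): column i of `x` belongs to
    the i-th entry of sess.get_inputs(), and `dtype` matches the input type.
    """
    x = np.asarray(x)
    if x.ndim == 1:
        x = x.reshape(1, -1)  # a single sample becomes a batch of one

    input_names = [node.name for node in sess.get_inputs()]
    if x.shape[1] != len(input_names):
        raise ValueError(
            f"expected {len(input_names)} feature columns, got {x.shape[1]}"
        )

    # Each column has shape (N,); the model expects (N, 1).
    feed = {
        name: x[:, i].astype(dtype).reshape(-1, 1)
        for i, name in enumerate(input_names)
    }
    output_name = sess.get_outputs()[0].name
    return sess.run([output_name], feed)[0].reshape(-1)
```

With the session from the example above, `predict(sess, X_test)` should return the same values as `pred_onx.flatten()`. For a model with int64 inputs, like the one in the question, pass `dtype=np.int64`.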
