I have trained YOLOv3-tiny on my custom dataset using PyTorch. To compare inference times, I ran the model with onnxruntime on CPU, with PyTorch on GPU, and with PyTorch on CPU. The average running times are roughly:
onnxruntime CPU: 110 ms - CPU usage: 60%
PyTorch GPU: 50 ms
PyTorch CPU: 165 ms - CPU usage: 40%
All models run with batch size 1.
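For reference, the averages come from a simple loop roughly like the one below; the 1x3x416x416 input shape and the run counts are illustrative rather than my exact script:

import time
import numpy as np
import onnxruntime

# Illustrative benchmark; 1x3x416x416 is a common YOLOv3-tiny input size.
sess = onnxruntime.InferenceSession('model.onnx')
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 416, 416).astype(np.float32)

for _ in range(10):  # warm-up runs, not counted in the average
    sess.run(None, {input_name: dummy})

n_runs = 100
start = time.time()
for _ in range(n_runs):
    sess.run(None, {input_name: dummy})
print('avg latency: %.1f ms' % ((time.time() - start) / n_runs * 1000))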
However, I don't understand how onnxruntime can be faster than PyTorch on CPU, since I did not enable any of onnxruntime's optimization options. I just used this:
import onnxruntime

onnx_model = onnxruntime.InferenceSession('model.onnx')
onnx_model.run(None, {onnx_model.get_inputs()[0].name: input_imgs})
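By "no optimization options" I mean I never created a SessionOptions object; my understanding is that the explicit equivalent would look something like this (a sketch, not code I actually ran):

import onnxruntime

# Sketch of the options I did NOT set; as far as I can tell,
# onnxruntime enables graph optimizations by default anyway.
opts = onnxruntime.SessionOptions()
opts.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
sess = onnxruntime.InferenceSession('model.onnx', sess_options=opts)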
Can someone explain why it is faster even without any optimization? Also, why is CPU usage higher with onnxruntime, and is there a way to keep it down?
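The only knob I have come across for the CPU load is the thread-pool size; is limiting it the intended approach (sketch below, the value 2 is arbitrary), or is there something better?

import onnxruntime

# Hypothetical fix I am unsure about: capping the intra-op thread pool
# to reduce CPU usage, presumably at some cost in latency.
opts = onnxruntime.SessionOptions()
opts.intra_op_num_threads = 2  # arbitrary small value for illustration
sess = onnxruntime.InferenceSession('model.onnx', sess_options=opts)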
Thanks in advance.