
I have trained an object detection model to be used in production for real-time applications. I have the following two options for deploying it. Can anyone suggest which gives the best inference performance on a Jetson Xavier? Any other suggestions are also welcome.

  1. Convert the model to ONNX format and use it with TensorRT (see the export sketch below)
  2. Save the model as TorchScript and run inference in C++
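
For reference, this is roughly how I would export the model for option 1; the `resnet18` module and the 640×640 input below are just placeholders for my detector and its input size:

```python
import torch
import torchvision

# Placeholder for the trained detector; swap in your own torch.nn.Module.
model = torchvision.models.resnet18().eval()

# Dummy input at the resolution the model will see at inference time.
dummy_input = torch.randn(1, 3, 640, 640)

# Export to ONNX so the graph can later be parsed by TensorRT.
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    opset_version=13,
    input_names=["input"],
    output_names=["output"],
)
```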
Akshay Kumar

2 Answers


On Jetson hardware, my experience is that TensorRT is definitely faster. You can convert ONNX models to TensorRT engines using NVIDIA's ONNX parser, and for the best performance you can also enable mixed (FP16) precision. How to convert ONNX to TensorRT is explained in the TensorRT documentation: Section 3.2.5 covers the Python bindings and Section 2.2.5 the C++ bindings.
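
As a rough sketch (this follows the TensorRT 8.x Python API; older releases use `builder.build_engine` instead of `build_serialized_network`), building an engine from the ONNX file looks like this. The `trtexec` tool that ships with TensorRT does the same thing from the command line (`trtexec --onnx=model.onnx --saveEngine=model.engine --fp16`).

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, engine_path, fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch networks are required when parsing ONNX models.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB of scratch space for tactic selection
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # mixed precision, as mentioned above

    serialized = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized)

build_engine("model.onnx", "model.engine")
```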

joostblack
  • Thank you. I've found [this tutorial](https://github.com/NVIDIA/TensorRT/blob/master/quickstart/IntroNotebooks/4.%20Using%20PyTorch%20through%20ONNX.ipynb) helpful for converting PyTorch models to TensorRT through ONNX – SomethingSomething Nov 07 '22 at 12:34

I don't have any experience with the Jetson Xavier, but on the Jetson Nano TensorRT is a little faster than ONNX Runtime or PyTorch. TorchScript makes no noticeable difference compared to plain PyTorch.
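
If you do want to try option 2 anyway, producing the TorchScript file is just a trace-and-save; the `resnet18` here is only a stand-in for the trained detector:

```python
import torch
import torchvision

# Stand-in for the trained detector; replace with your own model.
model = torchvision.models.resnet18().eval()
example = torch.randn(1, 3, 640, 640)

# Tracing records the operations for this input shape and saves a self-contained
# module that can be loaded from C++ with torch::jit::load("model.pt").
traced = torch.jit.trace(model, example)
traced.save("model.pt")
```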

Navid Naderi