
I have trained an object detection model to be used in production for real-time applications. I have the following two options for deploying it. Can anyone suggest which gives the best inference performance on a Jetson Xavier? Any other suggestions are also welcome.

  1. Convert the model to ONNX format and use it with TensorRT (see the export sketch below)
  2. Save the model as TorchScript and run inference in C++
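
For reference, this is roughly how I would export the model for option 1; the `resnet18` module and the 640×640 input below are just placeholders for my detector and its input size:

```python
import torch
import torchvision

# Placeholder for the trained detector; swap in your own torch.nn.Module.
model = torchvision.models.resnet18().eval()

# Dummy input at the resolution the model will see at inference time.
dummy_input = torch.randn(1, 3, 640, 640)

# Export to ONNX so the graph can later be parsed by TensorRT.
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    opset_version=13,
    input_names=["input"],
    output_names=["output"],
)
```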
Akshay Kumar

2 Answers


On Jetson hardware, my experience is that TensorRT is definitely faster. You can convert ONNX models to TensorRT engines using NVIDIA's ONNX parser, and for the best performance you can also enable mixed (FP16) precision. How to convert ONNX to TensorRT is explained in the TensorRT documentation: Section 3.2.5 covers the Python bindings and Section 2.2.5 the C++ bindings.
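
As a rough sketch (this follows the TensorRT 8.x Python API; older releases use `builder.build_engine` instead of `build_serialized_network`), building an engine from the ONNX file looks like this. The `trtexec` tool that ships with TensorRT does the same thing from the command line (`trtexec --onnx=model.onnx --saveEngine=model.engine --fp16`).

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, engine_path, fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch networks are required when parsing ONNX models.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB of scratch space for tactic selection
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # mixed precision, as mentioned above

    serialized = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized)

build_engine("model.onnx", "model.engine")
```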

joostblack
  • Thank you. I've found [this tutorial](https://github.com/NVIDIA/TensorRT/blob/master/quickstart/IntroNotebooks/4.%20Using%20PyTorch%20through%20ONNX.ipynb) helpful for converting PyTorch models to TensorRT through ONNX – SomethingSomething Nov 07 '22 at 12:34

I don't have any experience with the Jetson Xavier, but on the Jetson Nano TensorRT is a little faster than ONNX Runtime or PyTorch. TorchScript makes no noticeable difference compared to plain PyTorch.
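
If you do want to try option 2 anyway, producing the TorchScript file is just a trace-and-save; the `resnet18` here is only a stand-in for the trained detector:

```python
import torch
import torchvision

# Stand-in for the trained detector; replace with your own model.
model = torchvision.models.resnet18().eval()
example = torch.randn(1, 3, 640, 640)

# Tracing records the operations for this input shape and saves a self-contained
# module that can be loaded from C++ with torch::jit::load("model.pt").
traced = torch.jit.trace(model, example)
traced.save("model.pt")
```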

Navid Naderi