I want to convert the Hugging Face Transformers deepset/roberta-base-squad2 question-answering model to TensorRT to speed up inference. Since the model is implemented in PyTorch, I first exported it to ONNX. Now I want to convert the ONNX model to a TensorRT engine and run inference on it. Is there any method specific to the Transformers RoBERTa Q/A model for converting it to TensorRT and performing inference?
I used the following command to convert the model to ONNX format:

```
python -m transformers.onnx --model=deepset/roberta-base-squad2 --feature=question-answering onnx/
```
It generated the following result: [screenshot: Result for Onnx Conversion]
Now I want to convert this ONNX model to TensorRT and run inference on it. Do we have any code specific to the deepset/roberta-base-squad2 model for question answering?
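For context, the generic route I have been looking at is either the `trtexec` CLI (`trtexec --onnx=onnx/model.onnx --saveEngine=roberta_qa.engine`, plus `--minShapes`/`--optShapes`/`--maxShapes` for the dynamic axes) or the TensorRT Python API. Below is a rough sketch of the Python route, assuming TensorRT 8.x; the input names (`input_ids`, `attention_mask`), the shape ranges, and the file paths are my assumptions based on the `transformers.onnx` export, and I have not verified this end to end for this model:

```
import tensorrt as trt

ONNX_PATH = "onnx/model.onnx"      # file produced by the transformers.onnx export
ENGINE_PATH = "roberta_qa.engine"  # hypothetical output path

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open(ONNX_PATH, "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

# The export has dynamic (batch, sequence) inputs, so an optimization
# profile is required; these shape ranges are assumptions.
profile = builder.create_optimization_profile()
for name in ("input_ids", "attention_mask"):
    profile.set_shape(name, min=(1, 1), opt=(1, 384), max=(1, 512))
config.add_optimization_profile(profile)

with open(ENGINE_PATH, "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```

And this is how I assume inference on the serialized engine would work, using `pycuda` for the device buffers (again only a sketch: the binding order of inputs-then-outputs and the int32 input dtype are assumptions, since TensorRT's ONNX parser casts the exporter's int64 inputs down to int32):

```
import numpy as np
import pycuda.autoinit  # noqa: F401 — creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt
from transformers import AutoTokenizer

logger = trt.Logger(trt.Logger.WARNING)
with open("roberta_qa.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")
enc = tokenizer("Who maintains this model?",
                "The roberta-base-squad2 model is maintained by deepset.",
                return_tensors="np")
input_ids = enc["input_ids"].astype(np.int32)
attention_mask = enc["attention_mask"].astype(np.int32)
context.set_binding_shape(0, input_ids.shape)       # dynamic shapes must be set
context.set_binding_shape(1, attention_mask.shape)

# Host buffers for the QA head's two outputs (start/end logits).
seq_len = input_ids.shape[1]
start_logits = np.empty((1, seq_len), dtype=np.float32)
end_logits = np.empty((1, seq_len), dtype=np.float32)

# Device buffers in the assumed binding order: two inputs, then two outputs.
buffers = [cuda.mem_alloc(a.nbytes)
           for a in (input_ids, attention_mask, start_logits, end_logits)]
cuda.memcpy_htod(buffers[0], input_ids)
cuda.memcpy_htod(buffers[1], attention_mask)
context.execute_v2([int(b) for b in buffers])
cuda.memcpy_dtoh(start_logits, buffers[2])
cuda.memcpy_dtoh(end_logits, buffers[3])

# Decode the highest-scoring answer span.
answer_ids = input_ids[0, start_logits.argmax(): end_logits.argmax() + 1]
print(tokenizer.decode(answer_ids))
```

Is this generic approach correct for this model, or is there a RoBERTa-specific path I should be using instead?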