I want to convert the Hugging Face Transformers deepset/roberta-base-squad2 question-answering model to TensorRT to speed up inference. Since the model is implemented in PyTorch, I first exported it to ONNX. Now I want to convert the ONNX model to a TensorRT engine and run inference on it. Is there any method specific to the Transformers RoBERTa Q/A model for converting it to TensorRT and performing inference?
I used the following command to convert the model to ONNX format:

```
python -m transformers.onnx --model=deepset/roberta-base-squad2 --feature=question-answering onnx/
```
It generated the following result: [screenshot: Result for Onnx Conversion]
Now I want to convert this ONNX model to TensorRT and run inference on it. Do we have any code specific to the deepset/roberta-base-squad2 model for question answering?
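For context, the generic route I have been looking at is either the `trtexec` CLI (`trtexec --onnx=onnx/model.onnx --saveEngine=roberta_qa.engine`, plus `--minShapes`/`--optShapes`/`--maxShapes` for the dynamic axes) or the TensorRT Python API. Below is a rough sketch of the Python route, assuming TensorRT 8.x; the input names (`input_ids`, `attention_mask`), the shape ranges, and the file paths are my assumptions based on the `transformers.onnx` export, and I have not verified this end to end for this model:

```
import tensorrt as trt

ONNX_PATH = "onnx/model.onnx"      # file produced by the transformers.onnx export
ENGINE_PATH = "roberta_qa.engine"  # hypothetical output path

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open(ONNX_PATH, "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

# The export has dynamic (batch, sequence) inputs, so an optimization
# profile is required; these shape ranges are assumptions.
profile = builder.create_optimization_profile()
for name in ("input_ids", "attention_mask"):
    profile.set_shape(name, min=(1, 1), opt=(1, 384), max=(1, 512))
config.add_optimization_profile(profile)

with open(ENGINE_PATH, "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```

And this is how I assume inference on the serialized engine would work, using `pycuda` for the device buffers (again only a sketch: the binding order of inputs-then-outputs and the int32 input dtype are assumptions, since TensorRT's ONNX parser casts the exporter's int64 inputs down to int32):

```
import numpy as np
import pycuda.autoinit  # noqa: F401 — creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt
from transformers import AutoTokenizer

logger = trt.Logger(trt.Logger.WARNING)
with open("roberta_qa.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")
enc = tokenizer("Who maintains this model?",
                "The roberta-base-squad2 model is maintained by deepset.",
                return_tensors="np")
input_ids = enc["input_ids"].astype(np.int32)
attention_mask = enc["attention_mask"].astype(np.int32)
context.set_binding_shape(0, input_ids.shape)       # dynamic shapes must be set
context.set_binding_shape(1, attention_mask.shape)

# Host buffers for the QA head's two outputs (start/end logits).
seq_len = input_ids.shape[1]
start_logits = np.empty((1, seq_len), dtype=np.float32)
end_logits = np.empty((1, seq_len), dtype=np.float32)

# Device buffers in the assumed binding order: two inputs, then two outputs.
buffers = [cuda.mem_alloc(a.nbytes)
           for a in (input_ids, attention_mask, start_logits, end_logits)]
cuda.memcpy_htod(buffers[0], input_ids)
cuda.memcpy_htod(buffers[1], attention_mask)
context.execute_v2([int(b) for b in buffers])
cuda.memcpy_dtoh(start_logits, buffers[2])
cuda.memcpy_dtoh(end_logits, buffers[3])

# Decode the highest-scoring answer span.
answer_ids = input_ids[0, start_logits.argmax(): end_logits.argmax() + 1]
print(tokenizer.decode(answer_ids))
```

Is this generic approach correct for this model, or is there a RoBERTa-specific path I should be using instead?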