
I followed the instructions here (https://github.com/huggingface/transformers/blob/master/docs/source/serialization.rst) to convert the BART-LARGE-CNN model to ONNX using the transformers.onnx script. The model exported fine and I can run inference.
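For reference, the export command from those docs looks like this (note that the default feature exports a plain `BartModel`, whose only graph output is `last_hidden_state`):

```
python -m transformers.onnx --model=facebook/bart-large-cnn onnx/
```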

However, the inference results, taken from 'last_hidden_state', appear to be logits (I think)? How can I parse this output for summarization purposes?

Here are screenshots of what I've done.

[screenshot]

This is the resulting output from those two states:

[screenshot]

  • Did you find a solution to this? I am also trying to speed up a summarizer's inference by using ONNX. I have tried `$ python -m transformers.onnx --model=sshleifer/distilbart-cnn-6-6 ` but I get the message `Some weights of the model checkpoint at sshleifer/distilbart-cnn-6-6 were not used when initializing BartModel: ['final_logits_bias']` – S.MC. Dec 14 '21 at 14:09
    The reason is that the summarization is done separately from the actual BART inference. So once you convert the BART model itself, you need to write your own beam search or similar generation method. To my knowledge this is currently not implemented in Hugging Face, so you have to do it yourself (a minimal sketch follows below). – ZWang Dec 21 '21 at 12:25
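A minimal greedy-decoding sketch along those lines. It assumes the model was re-exported with the seq2seq-lm feature (so the graph outputs `logits` rather than `last_hidden_state`), and that the input/output names match what `transformers.onnx` produced at the time; greedy decoding is used instead of beam search for brevity:

```python
# Export first with:
#   python -m transformers.onnx --model=facebook/bart-large-cnn --feature=seq2seq-lm onnx/
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
session = ort.InferenceSession("onnx/model.onnx")

enc = tokenizer("long article text ...", return_tensors="np")

# BART starts decoding from the eos token; greedily append the arg-max
# token until eos is generated again or a length cap is reached.
# (Without cached past key/values this re-runs the full model each step,
# so it is slow -- fine for illustration only.)
decoder_input_ids = np.array([[tokenizer.eos_token_id]], dtype=np.int64)
for _ in range(142):  # arbitrary max summary length
    outputs = session.run(
        ["logits"],
        {
            "input_ids": enc["input_ids"].astype(np.int64),
            "attention_mask": enc["attention_mask"].astype(np.int64),
            "decoder_input_ids": decoder_input_ids,
            "decoder_attention_mask": np.ones_like(decoder_input_ids),
        },
    )
    next_token = int(outputs[0][0, -1].argmax())
    decoder_input_ids = np.concatenate(
        [decoder_input_ids, [[next_token]]], axis=1
    )
    if next_token == tokenizer.eos_token_id:
        break

print(tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True))
```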

1 Answer


I have implemented fast-Bart, which essentially converts the BART model from PyTorch to ONNX, with generate capabilities.

fast-Bart
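A usage sketch, assuming fast-Bart mirrors fastT5-style entry points. The import path and the function name `export_and_get_onnx_model` are my assumptions, not confirmed against the fast-Bart README, so check the repo for the actual API:

```python
# Hypothetical usage; import and function names are assumptions.
from fastBart import export_and_get_onnx_model
from transformers import AutoTokenizer

model_name = "facebook/bart-large-cnn"
model = export_and_get_onnx_model(model_name)  # export encoder/decoder to ONNX
tokenizer = AutoTokenizer.from_pretrained(model_name)

tokens = tokenizer("long article text ...", return_tensors="pt")
# The wrapper is expected to expose the usual generate() interface,
# so beam search works as it does with the PyTorch model.
summary_ids = model.generate(
    input_ids=tokens["input_ids"],
    attention_mask=tokens["attention_mask"],
    num_beams=4,
    max_length=142,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```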