
I followed the instructions here (https://github.com/huggingface/transformers/blob/master/docs/source/serialization.rst) to convert the BART-LARGE-CNN model to ONNX using the transformers.onnx script. The model exported fine and I can run inference.
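For reference, the export command from those docs looks like this (note that the default feature exports a plain `BartModel`, whose only graph output is `last_hidden_state`):

```
python -m transformers.onnx --model=facebook/bart-large-cnn onnx/
```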

However, the inference results, taken from 'last_hidden_state', appear to be logits (I think)? How can I parse this output for summarization purposes?

Here are screenshots of what I've done.

[screenshot]

This is the resulting output from those two states:

[screenshot]

  • Did you find a solution to this? I am also trying to speed up a summarizer's inference by using ONNX. I have tried `$ python -m transformers.onnx --model=sshleifer/distilbart-cnn-6-6 ` but I get the message `Some weights of the model checkpoint at sshleifer/distilbart-cnn-6-6 were not used when initializing BartModel: ['final_logits_bias']` – S.MC. Dec 14 '21 at 14:09
    The reason is that the summarization is done separately from the actual BART inference. So once you convert the BART model itself, you need to write your own beam search or similar generation method. To my knowledge this is currently not implemented in Hugging Face, so you have to do it yourself (a minimal sketch follows below). – ZWang Dec 21 '21 at 12:25
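A minimal greedy-decoding sketch along those lines. It assumes the model was re-exported with the seq2seq-lm feature (so the graph outputs `logits` rather than `last_hidden_state`), and that the input/output names match what `transformers.onnx` produced at the time; greedy decoding is used instead of beam search for brevity:

```python
# Export first with:
#   python -m transformers.onnx --model=facebook/bart-large-cnn --feature=seq2seq-lm onnx/
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
session = ort.InferenceSession("onnx/model.onnx")

enc = tokenizer("long article text ...", return_tensors="np")

# BART starts decoding from the eos token; greedily append the arg-max
# token until eos is generated again or a length cap is reached.
# (Without cached past key/values this re-runs the full model each step,
# so it is slow -- fine for illustration only.)
decoder_input_ids = np.array([[tokenizer.eos_token_id]], dtype=np.int64)
for _ in range(142):  # arbitrary max summary length
    outputs = session.run(
        ["logits"],
        {
            "input_ids": enc["input_ids"].astype(np.int64),
            "attention_mask": enc["attention_mask"].astype(np.int64),
            "decoder_input_ids": decoder_input_ids,
            "decoder_attention_mask": np.ones_like(decoder_input_ids),
        },
    )
    next_token = int(outputs[0][0, -1].argmax())
    decoder_input_ids = np.concatenate(
        [decoder_input_ids, [[next_token]]], axis=1
    )
    if next_token == tokenizer.eos_token_id:
        break

print(tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True))
```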

1 Answer


I have implemented fast-Bart, which essentially converts the BART model from PyTorch to ONNX, with generate capabilities.

fast-Bart
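A usage sketch, assuming fast-Bart mirrors fastT5-style entry points. The import path and the function name `export_and_get_onnx_model` are my assumptions, not confirmed against the fast-Bart README, so check the repo for the actual API:

```python
# Hypothetical usage; import and function names are assumptions.
from fastBart import export_and_get_onnx_model
from transformers import AutoTokenizer

model_name = "facebook/bart-large-cnn"
model = export_and_get_onnx_model(model_name)  # export encoder/decoder to ONNX
tokenizer = AutoTokenizer.from_pretrained(model_name)

tokens = tokenizer("long article text ...", return_tensors="pt")
# The wrapper is expected to expose the usual generate() interface,
# so beam search works as it does with the PyTorch model.
summary_ids = model.generate(
    input_ids=tokens["input_ids"],
    attention_mask=tokens["attention_mask"],
    num_beams=4,
    max_length=142,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```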