
Problem description:

I have a model based on BERT, with a classifier layer on top. I want to export it to ONNX, but to avoid issues on the side of the 'user' of the ONNX model, I want to export the entire pipeline, including tokenization, as a single ONNX model. However, that requires a plain string as the input type, which I believe ONNX does not support.

The Model:

import torch
from torch import nn
from transformers import AutoTokenizer, BertModel


class BertClassifier(nn.Module):
    """
    Class defining the classifier model with a BERT encoder and a single fully connected classifier layer.
    """
    def __init__(self, dropout=0.5, num_labels=24):
        super(BertClassifier, self).__init__()

        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(768, num_labels)
        self.relu = nn.ReLU()
        self.best_score = 0

    def forward(self, input_id, mask):
        # Use BERT's pooled [CLS] output as the sentence representation
        _, pooled_output = self.bert(input_ids=input_id, attention_mask=mask, return_dict=False)
        output = self.relu(self.linear(self.dropout(pooled_output)))

        return output

The Tokenizer:

def get_tokenizer(chosen_model):
    # e.g. chosen_model = 'bert-base-uncased'
    return AutoTokenizer.from_pretrained(chosen_model)
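
For reference, a minimal sketch of the current two-step usage (tokenize, then call the classifier) that the combined pipeline below is meant to collapse into a single string-in call; the example sentence and label count are illustrative only:

# Sketch of the two-step usage the ONNX wrapper should replace
tokenizer = get_tokenizer('bert-base-uncased')
model = BertClassifier(num_labels=24)
model.eval()

# Tokenize outside the model, then feed tensors to the classifier
encoded = tokenizer("An example sentence to classify.",
                    padding='max_length', max_length=512, truncation=True,
                    return_tensors="pt")

with torch.no_grad():
    scores = model(encoded['input_ids'], encoded['attention_mask'])
# scores has shape (1, num_labels)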

Combined Pipeline:

class OnnxBertModel(nn.Module):
    """
    Model wrapper for onnx. Allows user to only provide a string as input. Output is a list of class probabilities
    """
    def __init__(self, dropout=0.5, num_labels=24):
        super(OnnxBertModel, self).__init__()

        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(768, num_labels)
        self.relu = nn.ReLU()
        self.best_score = 0
        self.tokenizer = get_tokenizer('bert-base-uncased')

    def forward(self, input_string):
        input_tokens = self.tokenizer(input_string,
                       padding='max_length', max_length=512, truncation=True,
                       return_tensors="pt")
        mask = input_tokens['attention_mask']
        input_id = input_tokens['input_ids'].squeeze(1)
        _, pooled_output = self.bert(input_ids=input_id, attention_mask=mask, return_dict=False)
        output = self.relu(self.linear(self.dropout(pooled_output)))
        return output
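
In eager PyTorch this wrapper does run on a raw string; a quick check (input string is illustrative):

onnx_model = OnnxBertModel(num_labels=24)
onnx_model.eval()
with torch.no_grad():
    out = onnx_model("An example sentence to classify.")
# out has shape (1, num_labels); only the ONNX export step fails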

Additional code to export:

model = OnnxBertModel(num_labels=len(labels))
# ex_string is a raw example sentence used as the export input
torch.onnx.export(model, ex_string, 'tryout.onnx', export_params=True, do_constant_folding=False)

The last call fails because of the string input type.
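
For comparison, exporting only the tensor-input BertClassifier and keeping tokenization outside the ONNX graph does trace, since torch.onnx.export works with example tensors. A minimal sketch; the dummy shapes, input/output names, opset version, and file name are assumptions, not part of my original code:

# Sketch: export the tensor-input classifier only, tokenize outside the graph
model = BertClassifier(num_labels=24)
model.eval()

# Illustrative dummy inputs with the same shapes the tokenizer produces
dummy_ids = torch.ones(1, 512, dtype=torch.long)
dummy_mask = torch.ones(1, 512, dtype=torch.long)

torch.onnx.export(
    model,
    (dummy_ids, dummy_mask),
    'bert_classifier.onnx',
    export_params=True,
    do_constant_folding=False,
    input_names=['input_ids', 'attention_mask'],
    output_names=['scores'],
    opset_version=14,
)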

Kroshtan
  • What type of error are you getting with the string input? My model was able to export if I converted the output to a string in the forward method, but I get errors further downstream when I run inference. – Matt Apr 29 '22 at 19:43

0 Answers