
I have successfully trained a text emotion classifier by fine-tuning a RoBERTa language model, mostly following a helpful notebook I found online. Now I am trying to write a function that generates a prediction for a single sample (a sentence), but I can't get it to work.

My model class is like this:

from torch import nn

# Mish is a custom activation defined elsewhere in the notebook I followed
# (recent PyTorch versions also ship nn.Mish).
class ClassificationModel(nn.Module):
    def __init__(self, base_model, n_classes, base_model_output_size=768, dropout=0.05):
        super().__init__()
        self.base_model = base_model

        # classification head on top of the base model's first (<s>) token representation
        self.classifier = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(base_model_output_size, base_model_output_size),
            Mish(),
            nn.Dropout(dropout),
            nn.Linear(base_model_output_size, n_classes)
        )

        # re-initialize the classifier's linear layers
        for layer in self.classifier:
            if isinstance(layer, nn.Linear):
                layer.weight.data.normal_(mean=0.0, std=0.02)
                if layer.bias is not None:
                    layer.bias.data.zero_()

    def forward(self, input_, *args):
        X, attention_mask = input_
        hidden_states = self.base_model(X, attention_mask=attention_mask)

        # hidden_states[0] is the last hidden state; classify the first (<s>) token
        return self.classifier(hidden_states[0][:, 0, :])

And once I create an instance of this class, I am able to successfully load the state dictionary of my fine-tuned weights. But my function for classifying a single sample won't work: the line that computes the output raises an error about the wrong number of values to unpack.
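For reference, the instantiation and loading step looks roughly like the sketch below; it assumes the ClassificationModel class and the Mish activation above are already defined, and the base model name, checkpoint path, and n_classes here are placeholders rather than my exact values:

import torch
from transformers import RobertaModel

base_model = RobertaModel.from_pretrained("roberta-base")     # placeholder base checkpoint
model = ClassificationModel(base_model, n_classes=6)          # placeholder number of classes
model.load_state_dict(torch.load("fine_tuned_weights.pt"))    # placeholder path to the weights
model.eval()

And here is the single-sample function that fails: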

import re
import torch
from tokenizers import ByteLevelBPETokenizer
from tokenizers.processors import BertProcessing

def get_label(text):
    # strip punctuation and lowercase the input
    text = re.sub(r'[^\w\s]', '', text)
    text = text.lower()

    tokenizer = ByteLevelBPETokenizer(
        "tokenizer/vocab.json",
        "tokenizer/merges.txt"
    )
    tokenizer._tokenizer.post_processor = BertProcessing(
        ("</s>", tokenizer.token_to_id("</s>")),
        ("<s>", tokenizer.token_to_id("<s>")),
    )
    tokenizer.enable_truncation(512)
    tokenizer.enable_padding(pad_id=tokenizer.token_to_id("<pad>"))

    encoded = tokenizer.encode(text)
    sequence_padded = torch.tensor(encoded.ids)
    attention_mask_padded = torch.tensor(encoded.attention_mask)

    output = model((sequence_padded, attention_mask_padded))  # this does not work

    dec = [tokenizer.decode(ids) for ids in output]  # not sure if this will work
    label = dec[0]
    return label

My impression is that the forward function of the model takes the (tokenized) input and the attention mask, but when I try to call the function

get_label('this is text to classify')

I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-9a58782b060e> in <module>()
----> 1 get_label('this is text to classify')

4 frames
<ipython-input-33-bc7c6292a9bf> in get_label(text)
     23 
     24 
---> 25   output = model((sequence_padded, attention_mask_padded)) #this does not work
     26 
     27   dec = [tokenizer.decode(ids) for ids in output] #not sure if this will work

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-8-275361c034f8> in forward(self, input_, *args)
     22     def forward(self, input_, *args):
     23         X, attention_mask = input_
---> 24         hidden_states = self.base_model(X, attention_mask=attention_mask)
     25 
     26         return self.classifier(hidden_states[0][:, 0, :])

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/transformers/models/roberta/modeling_roberta.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    790         elif input_ids is not None:
    791             input_shape = input_ids.size()
--> 792             batch_size, seq_length = input_shape
    793         elif inputs_embeds is not None:
    794             input_shape = inputs_embeds.size()[:-1]

ValueError: not enough values to unpack (expected 2, got 1)

I am at a loss as to what I should do. It seems that I am not feeding the correct thing to the forward function of my model. I am just starting out and any help would be greatly appreciated. Thanks!

  • Just guessing --- Don't you need to add the batch dimension to your input? – Kota Mori Aug 24 '21 at 17:59
  • right, but how would that be? Do I need to pass both sequence_padded and attention_mask_padded to the forward function? I thought that because I am not using the dataloader or anything, it would not expect a batch? – user14501128 Aug 24 '21 at 18:53
  • Show the full error stacktrace and give us a sample text. – cronoik Aug 24 '21 at 21:05
  • Perhaps, you can `encoded = tokenizer.encoder([text])`. – Kota Mori Aug 25 '21 at 00:21
  • Thank you @cronoik , I have added to the post the function call with the text input and the full error I get, if you could be so kind to have a look. Thank you – user14501128 Aug 25 '21 at 02:06
  • thank you @KotaMori , I tried out putting the text in a list. unfortunately that did not work for me, it tells me that the tokenizer can only encode a string – user14501128 Aug 25 '21 at 02:08
  • Probably adding another dimension to the input tensors does the trick! You have passed a 1D tensor but it should be 2D! Use `torch.unsqueeze()` – meti Aug 25 '21 at 04:23
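
A minimal sketch of what the commenters are suggesting, assuming the fine-tuned model and the tokenizer setup from get_label above are already in place; label_names is a hypothetical placeholder for whatever index-to-emotion mapping was used during training:

import torch

# Give both tensors a batch dimension so their shape is (1, seq_length),
# which is what the RoBERTa forward pass unpacks into batch_size, seq_length.
encoded = tokenizer.encode('this is text to classify')
sequence_padded = torch.tensor(encoded.ids).unsqueeze(0)                  # shape (1, seq_len)
attention_mask_padded = torch.tensor(encoded.attention_mask).unsqueeze(0)

with torch.no_grad():
    output = model((sequence_padded, attention_mask_padded))              # shape (1, n_classes)

# The model returns class logits, not token ids, so tokenizer.decode() is not
# the right tool here; take the argmax and map it through the training label order.
label_names = ["anger", "fear", "joy", "love", "sadness", "surprise"]     # hypothetical placeholder
predicted_index = output.argmax(dim=-1).item()
label = label_names[predicted_index]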
