Code worked fine a week ago but has raised an error since yesterday: fine-tuning a BERT model with PyTorch on Colab

Question

I am new to BERT. Two weeks ago I successfully fine-tuned a BERT model on an NLP classification task, though the results were not brilliant. Yesterday, however, when I ran the same code on the same data, it failed every time with an AttributeError: 'str' object has no attribute 'dim'. Everything runs on Colab and uses PyTorch with the Transformers library. What should I do to fix it?

One thing I tried that did not work: instead of !pip install transformers, I installed an earlier transformers version with !pip install --target lib --upgrade transformers==3.5.0

Any feedback will be greatly appreciated!

Please see the code and the error message below:

Code:

train definition

# function to train the model
def train():
  
  model.train()

  total_loss, total_accuracy = 0, 0
  
  # empty list to save model predictions
  total_preds=[]
  
  # iterate over batches
  for step,batch in enumerate(train_dataloader):
    
    # progress update after every 200 batches.
    if step % 200 == 0 and step != 0:
      print('  Batch {:>5,}  of  {:>5,}.'.format(step, len(train_dataloader)))

    # push the batch to gpu
    batch = [r.to(device) for r in batch]
 
    sent_id, mask, labels = batch

    # clear previously calculated gradients 
    model.zero_grad()        

    # get model predictions for the current batch
    preds = model(sent_id, mask)

    # compute the loss between actual and predicted values
    loss = cross_entropy(preds, labels)

    # add on to the total loss
    total_loss = total_loss + loss.item()

    # backward pass to calculate the gradients
    loss.backward()

    # clip the gradient norm to 1.0. It helps prevent the exploding gradient problem
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

    # update parameters
    optimizer.step()

    # update learning rate schedule
    # scheduler.step()  

    # model predictions are stored on GPU. So, push it to CPU
    preds=preds.detach().cpu().numpy()

    # append the model predictions
    total_preds.append(preds)

  # compute the training loss of the epoch
  avg_loss = total_loss / len(train_dataloader)
  
  # predictions are in the form of (no. of batches, size of batch, no. of classes).
  # reshape the predictions in form of (number of samples, no. of classes)
  total_preds  = np.concatenate(total_preds, axis=0)

  #returns the loss and predictions
  return avg_loss, total_preds

training process

# set initial loss to infinite
best_valid_loss = float('inf')

# empty lists to store training and validation loss of each epoch
train_losses=[]
valid_losses=[]

#for each epoch
for epoch in range(epochs):
     
    print('\n Epoch {:} / {:}'.format(epoch + 1, epochs))
    
    #train model
    train_loss, _ = train()
    
    #evaluate model
    valid_loss, _ = evaluate()
    
    #save the best model
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(model.state_dict(), 'saved_weights.pt')
    
    # append training and validation loss
    train_losses.append(train_loss)
    valid_losses.append(valid_loss)
    
    print(f'\nTraining Loss: {train_loss:.3f}')
    print(f'Validation Loss: {valid_loss:.3f}')

Error message:

 Epoch 1 / 10
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-41-c5138ddf6b25> in <module>()
     12 
     13     #train model
---> 14     train_loss, _ = train()
     15 
     16     #evaluate model

5 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1686         if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
   1687             return handle_torch_function(linear, tens_ops, input, weight, bias=bias)
-> 1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
   1690         ret = torch.addmm(bias, input, weight.t())

AttributeError: 'str' object has no attribute 'dim'

@Bei Zhao Hi Bei, I saw your question and answer about the same error as mine. Due to lack of reputation, I am not allowed to comment on your post, so I hope you can see this message. In your post you mentioned 'change my version into the old one fix the issue'. I tried the same thing but it still failed. Could you please share the code that fixed your problem? Very much appreciated! – chen256, Dec 02 '20 at 02:30
Please post the train() function as well. It would also be clearer if you provided the Colab link. – krenerd, Dec 02 '20 at 05:26
@krenerd Thank you so much for responding. Please see the train definition code now included in my post. I am not sure which line in the train function causes the error. I found a very similar post (Q&A from Bei Zhao) at https://stackoverflow.com/questions/65079318/attributeerror-str-object-has-no-attribute-dim-in-pytorch. Do you have any idea about this or about Bei's possible solution? Thanks a ton! – chen256, Dec 02 '20 at 06:42

score 3 · Accepted Answer · answered Dec 02 '20 at 07:09

As far as I remember, Colab used to ship an older transformers version, something like 2.11.0. Try:

!pip install transformers~=2.11.0
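After reinstalling, it may help to confirm which version the notebook actually imports. On Colab an already-imported package usually stays loaded until the runtime is restarted, and a package installed with --target lib may never be picked up if that folder is not on the import path. The helper below is a minimal sketch; describe_module is a hypothetical name, and "json" stands in for "transformers" only so that the snippet runs anywhere.

```python
import importlib


def describe_module(name):
    """Return (version, file path) of the module Python actually imports."""
    module = importlib.import_module(name)
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "built-in")
    return version, path


# In the notebook, call describe_module("transformers") instead.
# "json" is a stand-in from the standard library so this sketch is runnable as-is.
version, path = describe_module("json")
print(version, path)
```

If the printed path does not point at the location you installed into, or the version is not the one you pinned, the downgrade has not taken effect yet.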

Change the version number until it works.
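A possible explanation for why downgrading helps, offered as an assumption because the model class is not shown in the question: transformers 4.0.0, released at the end of November 2020, made models return dict-like output objects by default instead of plain tuples. If the model wrapper tuple-unpacks BERT's output (for example `_, cls_hs = self.bert(sent_id, attention_mask=mask)`), unpacking a dict-like object yields its string keys, so a string rather than a tensor reaches the linear layer. The sketch below reproduces that behaviour with a plain OrderedDict as a stand-in; the keys and values are illustrative, not real model output.

```python
from collections import OrderedDict

# Stand-in for a dict-like model output (hypothetical values, not real tensors).
outputs = OrderedDict(last_hidden_state=[[0.1, 0.2]], pooler_output=[0.3, 0.4])

# Tuple-unpacking a mapping iterates over its KEYS, not its values,
# so cls_hs ends up being a string.
_, cls_hs = outputs
print(type(cls_hs).__name__, cls_hs)

# Indexing by key returns the value that was actually wanted.
cls_hs_fixed = outputs["pooler_output"]
print(type(cls_hs_fixed).__name__)
```

If this is the cause, passing return_dict=False when calling the BERT model, or indexing the output by name, should also work on newer versions without pinning an old release.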

Andrey

Thank you so much for the answer, Andrey! I will definitely try it. Yesterday I tried several old versions, including 3.5.0, which I thought I had used days ago, but got the same error message each time. Hopefully 2.11.0 will work! Thanks! – chen256 Dec 02 '20 at 15:26
Unbelievable! @Andrey, it works! You are terrific and my lifesaver! There is so much for me to learn and such a long way to go. I also just found that my last working version was 3.0.2, not 3.5.0. Sorry for any confusion this may have caused. – chen256 Dec 02 '20 at 15:45
@chen256 please accept the answer if you are satisfied – Andrey Dec 02 '20 at 16:17
I did it several times @Andrey. But every time it says "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." A bit embarrassed. I should build up my reputation count soon. Will definitely come back . Sorry about that! – chen256 Dec 02 '20 at 16:22
Thank you @Andrey. I accepted the answer (Initially I did not see the difference between thumbs up and accept. Again sorry for my misunderstanding). In addition, in my last comment I meant I tried to thumbs up but encountered embarrassment. Thank U! Wow, just now I am able to thumbs up and rush to get it done! – chen256 Dec 02 '20 at 16:36
