
I need to get the last layer of embeddings from a BERT model using HuggingFace. The following code works, but it is extremely slow. How can I increase the speed?

This is a toy example; my real data consists of thousands of examples with long texts.

import transformers
import pandas as pd
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased") 
model = BertModel.from_pretrained("bert-base-uncased")

def getPrediction(text):
  encoded_input = tokenizer(text, return_tensors='pt')
  outputs = model(**encoded_input)
  embedding = outputs[0][:, -1]  # last hidden state, embedding at the final token position
  embedding_list = embedding.cpu().detach().tolist()[0]
  return embedding_list

df = pd.DataFrame({'text':['First text', 'Second text']})
results = pd.DataFrame(df.apply(lambda x: getPrediction(x.text), axis=1))
Ushuaia81
  • It is not the tokenizer; the model is slow. BERT is a big model. You can use a GPU to speed up computation. You can also speed up tokenization by passing `use_fast=True` to the tokenizer's `from_pretrained` call, which loads the Rust-based tokenizers, which are much faster. But I think the problem is not tokenization. – amdex Nov 27 '20 at 07:47
  • @amdex Is it possible for you to show me a minimal example of how to run this particular example on a GPU? – Ushuaia81 Nov 28 '20 at 02:20
  • @Alfredo_MF, do you have a GPU device? If so, you can refer to this: https://stackoverflow.com/questions/54216920/how-to-use-multiple-gpus-in-pytorch/64825728#64825728 – Ashwin Geet D'Sa Nov 29 '20 at 22:53
  • @AshwinGeetD'Sa I have a GPU. I will follow the link that you provided. – Ushuaia81 Nov 30 '20 at 01:39
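Building on the suggestions in the comments above, here is a minimal sketch of a faster version, assuming a CUDA-capable GPU is available. It combines the Rust-based fast tokenizer, batched tokenization with padding, and inference on the GPU under `torch.no_grad()`. The `get_embeddings` helper and the `batch_size=32` value are illustrative choices, not part of the original code.

import torch
import pandas as pd
from transformers import BertModel, BertTokenizerFast

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# BertTokenizerFast is the Rust-based tokenizer mentioned in the comments.
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased').to(device)
model.eval()  # inference mode: disables dropout

def get_embeddings(texts, batch_size=32):
    # batch_size=32 is an illustrative value; tune it to your GPU memory.
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        encoded = tokenizer(batch, padding=True, truncation=True,
                            return_tensors='pt').to(device)
        with torch.no_grad():  # gradients are not needed for embedding extraction
            outputs = model(**encoded)
        # outputs[0] is the last hidden state: shape (batch, seq_len, hidden_size).
        # With padding, position -1 may be a [PAD] token, so use the attention
        # mask to pick the last real token of each sequence, matching the
        # original code's last-position selection.
        last_token = encoded['attention_mask'].sum(dim=1) - 1
        rows = torch.arange(outputs[0].size(0), device=device)
        embeddings = outputs[0][rows, last_token]
        all_embeddings.extend(embeddings.cpu().tolist())
    return all_embeddings

df = pd.DataFrame({'text': ['First text', 'Second text']})
results = pd.DataFrame(get_embeddings(df['text'].tolist()))

Batching amortizes per-call overhead and keeps the GPU busy, which is where most of the speedup comes from; the attention-mask indexing is only there so that padded batches return the same last-token embedding the original per-example loop produced.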

0 Answers