
I am trying to learn how to use BERT. Here is the code:

from sklearn.datasets import fetch_20newsgroups
# Load the full 20 newsgroups corpus (~18,000 documents)
data = fetch_20newsgroups(subset='all')['data']

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('distilbert-base-nli-mean-tokens')
# Encode every document into a dense sentence embedding
embeddings = model.encode(data, show_progress_bar=True)

The problem is that it is incredibly slow: 24-48 hours to complete.

I have a macOS notebook with an M1 Pro chip. What can be done to speed up the process?

Thank you

Toly
  • Maybe the problem is that you are using the CPU instead of the GPU? – dankal444 Jul 03 '23 at 12:01
  • @dankal444 you mean adding: embeddings = model.encode(data, show_progress_bar=True, device='cuda') ? – Toly Jul 03 '23 at 14:10
  • I don't know the specifics of this library or how to make it use the GPU. I'm just saying that your timings suggest you are not using the GPU. You can look at your CPU/GPU usage during encoding. – dankal444 Jul 03 '23 at 16:48
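
Following up on the comment about GPU usage: below is a minimal sketch of how one might check for and use the Apple Silicon GPU (PyTorch's MPS backend) with sentence-transformers, assuming a reasonably recent PyTorch and sentence-transformers install; the device choice and batch_size value are assumptions to experiment with, not a verified fix.

import torch
from sentence_transformers import SentenceTransformer

# Use the Apple Silicon GPU (MPS backend) if this PyTorch build supports it,
# otherwise fall back to the CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"

model = SentenceTransformer('distilbert-base-nli-mean-tokens', device=device)

# Larger batches usually improve throughput; 64 is only a starting point to tune
embeddings = model.encode(data, batch_size=64, show_progress_bar=True)

Encoding a small slice first (for example data[:100]) makes it easy to time each device before committing to the full corpus.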

0 Answers