Error when parallelizing data processing with "sentence_transformers" on 2 GPUs from a Jupyter notebook

Question

I would like to use sentence-transformers (https://www.sbert.net/) to encode some English sentences. To speed this up, I am trying to run it on 2 T4 GPUs from a Jupyter notebook on GCP (Debian Linux, Python 3.8). (I originally posted this question at https://github.com/UKPLab/sentence-transformers/issues/2235 but got no response.)

from sentence_transformers import SentenceTransformer, LoggingHandler
import logging
    
    
logging.basicConfig(format='%(asctime)s - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    level=logging.INFO,
                    handlers=[LoggingHandler()])
    
sentences = ["This is sentence {}".format(i) for i in range(10)]
    
# Define the model
model = SentenceTransformer('all-MiniLM-L6-v2', device='cuda')

# Start the multi-process pool on all available CUDA devices
pool = model.start_multi_process_pool(target_devices=['cuda:0', 'cuda:1'])

# Compute the embeddings using the multi-process pool
emb = model.encode_multi_process(sentences, pool)  # error: the Jupyter kernel restarts here
    
print("Embeddings computed. Shape:", emb.shape, "type: ", type(emb))
print("Embeddings computed:", emb)

Output:

- Load pretrained SentenceTransformer: all-MiniLM-L6-v2
- Start multi-process pool on devices: cuda:0, cuda:1

Then I got this error:

Kernel Restarting
The kernel for my_notebook.ipynb appears to have died. It will restart automatically.

Could anybody let me know if I missed anything?

============== UPDATE ===========

TypeError                                 Traceback (most recent call last)
Cell In[13], line 15
     12 model = SentenceTransformer('all-MiniLM-L6-v2')
     14 # Move the model to the first device
---> 15 model = model.to(devices[0])
     17 # Wrap the model with DataParallel to utilize multiple GPUs
     18 model = torch.nn.DataParallel(model, device_ids=[device.index for device in devices])

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1126, in Module.to(self, *args, **kwargs)
   1039 def to(self, *args, **kwargs):
   1040     r"""Moves and/or casts the parameters and buffers.
   1041 
   1042     This can be called as
   (...)
   1123 
   1124     """
-> 1126     device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(*args, **kwargs)
   1128     if dtype is not None:
   1129         if not (dtype.is_floating_point or dtype.is_complex):

TypeError: to() received an invalid combination of arguments - got (device), but expected one of:
 * (torch.device device, torch.dtype dtype, bool non_blocking, bool copy, *, torch.memory_format memory_format)
 * (torch.dtype dtype, bool non_blocking, bool copy, *, torch.memory_format memory_format)
 * (Tensor tensor, bool non_blocking, bool copy, *, torch.memory_format memory_format)

score 0 · Answer 1 · answered Jun 17 '23 at 15:53

The error you encountered might be due to memory constraints when running the code on multiple GPUs with large models. To address this, you can try reducing the batch size or using a smaller model. You can also try encoding only a limited number of sentences for testing purposes.

import torch
from torch import cuda
from sentence_transformers import SentenceTransformer

# Check the number of available GPUs
num_gpus = torch.cuda.device_count()

# Specify the devices to be used (cuda:0, cuda:1, ...)
devices = [cuda.device(f'cuda:{i}') for i in range(num_gpus)]

# Initialize the model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Move the model to the first device
model = model.to(devices[0])

# Wrap the model with DataParallel to utilize multiple GPUs
model = torch.nn.DataParallel(model, device_ids=[device.index for device in devices])

# Encode sentences using the model
sentences = ["This is sentence {}".format(i) for i in range(10)]

with torch.no_grad():
    embeddings = []

    # Define the batch size
    batch_size = 2

    # Iterate over the sentences in batches
    for i in range(0, len(sentences), batch_size):
        # Move the batch of sentences to the first device
        input_sentences = [torch.tensor(sentence).to(devices[0]) for sentence in sentences[i:i+batch_size]]

        # Encode the batch of sentences using the model
        batch_embeddings = model(input_sentences)

        # Move the embeddings back to CPU
        batch_embeddings = batch_embeddings.cpu()

        # Collect the embeddings
        embeddings.append(batch_embeddings)

# Concatenate the embeddings from all devices and batches
embeddings = torch.cat(embeddings, dim=0)

print("Embeddings computed. Shape:", embeddings.shape, "type:", type(embeddings))
print("Embeddings computed:", embeddings)
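
The batching loop above can be checked on its own, without a GPU or a model. Below is a minimal sketch in plain Python. `make_batches` and `assign_round_robin` are hypothetical helper names, not part of sentence_transformers or PyTorch, and the round-robin assignment is only one possible way to spread batches over device names such as 'cuda:0' and 'cuda:1'. It does not reproduce how `encode_multi_process` or `DataParallel` schedule work internally.

```python
from typing import Dict, List, Sequence


def make_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def assign_round_robin(
    batches: Sequence[List[str]], device_names: Sequence[str]
) -> Dict[str, List[List[str]]]:
    """Assign batches to device names in turn (hypothetical scheduling, for illustration)."""
    if not device_names:
        raise ValueError("at least one device name is required")
    plan: Dict[str, List[List[str]]] = {name: [] for name in device_names}
    for index, batch in enumerate(batches):
        plan[device_names[index % len(device_names)]].append(batch)
    return plan


sentences = ["This is sentence {}".format(i) for i in range(10)]

# Plain strings only; no GPU is touched here.
device_names = ["cuda:{}".format(i) for i in range(2)]

batches = make_batches(sentences, batch_size=4)
plan = assign_round_robin(batches, device_names)

for name, assigned in plan.items():
    print(name, [len(batch) for batch in assigned])
```

Running a dry plan like this first makes it easy to confirm that every sentence ends up in exactly one batch before any model or GPU is involved.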

Thanks for your help. I got a new error with the code. – mtnt, Jun 20 '23 at 23:21
