I was doing some work where I wanted to generate 10,000 sentences with the GPT-Neo model. I have a 40 GB GPU and am running the model on it, but the code runs out of memory every time. Is there a limit to the number of sentences I can generate? Below is a small snippet of my code.

import torch
from transformers import GPT2Tokenizer, GPTNeoForCausalLM

# model_name is the GPT-Neo checkpoint and sentence is the prompt, both defined earlier
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPTNeoForCausalLM.from_pretrained(model_name, pad_token_id=tokenizer.eos_token_id)
model.to(device)
# the inputs must live on the same device as the model
input_ids = tokenizer.encode(sentence, return_tensors="pt").to(device)
gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    top_k=50,
    num_return_sequences=10000,
)
prb977
  • While there is no theoretical limit, your GPU will probably struggle with ~100s of sequences already. You will have to test the limits yourself, unfortunately, as there is no guarantee of how much you will be able to fit on your specific GPU. – dennlinger Mar 23 '22 at 13:43
  • I think that seems to be the case. I tried a bunch of sequence counts, and it seems like 300 is the limit for 40 GB of GPU (see the batching sketch below). – prb977 Mar 23 '22 at 23:58
  • Glad that you could figure out a way! Feel free to share more of your insights as a self-answer :) – dennlinger Mar 24 '22 at 11:57
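
Based on the comment thread, the practical workaround is to request the sequences in smaller batches instead of all 10,000 in a single generate() call. A minimal sketch, reusing the tokenizer, model, and input_ids from the question and assuming a batch_size of 300 (the empirical per-call limit reported above for a 40 GB GPU; tune it for your hardware):

num_sentences = 10000
batch_size = 300  # empirical per-call limit reported above; adjust for your GPU

sentences = []
for start in range(0, num_sentences, batch_size):
    # request at most batch_size sampled sequences per generate() call
    n = min(batch_size, num_sentences - start)
    gen_tokens = model.generate(
        input_ids,
        do_sample=True,
        top_k=50,
        num_return_sequences=n,
    )
    sentences.extend(tokenizer.batch_decode(gen_tokens, skip_special_tokens=True))

Decoding each batch to strings and discarding the token tensors before the next iteration keeps GPU memory usage roughly constant across batches.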

0 Answers