Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
0 votes, 1 answer

IndexError: index out of range in self while using GPT2LMHeadModel.from_pretrained("gpt2")

I am working on question-answering code using the pretrained GPT2LMHeadModel. After tokenization, when I pass the inputs and attention mask to the model, it raises an index error. My code: feedback_dataset = [] #…
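A common cause of this IndexError is a token id that falls outside the model's embedding table, typically after adding a pad or special token without resizing the model. A minimal sketch of that failure mode and its fix, using the stock gpt2 checkpoint rather than the asker's code:

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Adding a pad token creates id 50257, one past the original 50257-row embedding table.
    tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    # Without this resize, any padded batch triggers "IndexError: index out of range in self".
    model.resize_token_embeddings(len(tokenizer))

    inputs = tokenizer("Question: what does GPT-2 do?", return_tensors="pt", padding=True)
    outputs = model(**inputs, labels=inputs["input_ids"])
    print(outputs.loss)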
0 votes, 0 answers

TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] when training GPT-2

I'm learning how to train generative models using the transformers library from Hugging Face; however, I keep getting this error: TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] when I try to tokenize…
Erika • 151 • 3 • 12
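This TypeError usually means the fast tokenizer received something other than a string (often None, a number, or a nested list). A minimal sketch of guarding the inputs before tokenization; the sample data is an assumption, not taken from the question:

    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    samples = ["first training sentence", None, 42]   # None and 42 would raise the TypeError
    cleaned = [str(s) for s in samples if s is not None]
    batch = tokenizer(cleaned, truncation=True, padding=True)
    print(batch["input_ids"])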
0 votes, 1 answer

My gpt2 code generates a few correct words and then goes into a loop of generating the same sequence again and again

The following gpt2 code for sentence completion generates a few good sentences and then ends in a loop of repetitive sentences. from transformers import GPT2LMHeadModel, GPT2Tokenizer import torch …
steve landiss • 1,833 • 3 • 19 • 30
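Greedy decoding is the usual culprit behind these repetition loops. A minimal sketch of generation settings that typically break the loop (sampling, n-gram blocking, repetition penalty); the prompt and values are illustrative:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    input_ids = tokenizer.encode("The weather today is", return_tensors="pt")
    output = model.generate(
        input_ids,
        max_length=60,
        do_sample=True,            # sample instead of always taking the argmax token
        top_k=50,
        top_p=0.95,
        no_repeat_ngram_size=3,    # block repeated trigrams
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))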
0 votes, 0 answers

Getting an error while trying to train my model in train_function (Empty Logs)

I am trying to train a GPT-2 model on Wikipedia text. While doing so, I get the following error: ValueError: Unexpected result of `train_function` (Empty logs). Please use `Model.compile(..., run_eagerly=True)`, or…
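The error message itself points at the usual debugging step: recompile the Keras model with run_eagerly=True so the underlying exception (often an empty or mis-shaped dataset) is reported instead of the generic "Empty logs" failure. A minimal sketch, assuming a TFGPT2LMHeadModel and a tf.data dataset named train_dataset that are not shown in the question:

    import tensorflow as tf
    from transformers import TFGPT2LMHeadModel

    model = TFGPT2LMHeadModel.from_pretrained("gpt2")
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
        run_eagerly=True,  # eager execution surfaces the real error behind "Empty logs"
    )
    # model.fit(train_dataset, epochs=1)  # re-run fit; the actual exception now appears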
0 votes, 0 answers

GPT-2: Setting biases to -1 billion

I'm currently trying to predict upcoming words given an input text chunk, but I want to "mask" the last n words of the input text by setting the attention weights to 0 (or something very small). This is what I tried to do: I tried modifying the…
Merle • 125 • 1 • 14
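Rather than patching the bias tensors directly, the same effect can usually be had by zeroing the relevant positions in attention_mask, which the model converts to a large negative bias internally. A minimal sketch, with the prompt and the number of hidden tokens chosen for illustration:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    enc = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")
    n_hidden = 3                                   # hide the last 3 tokens from attention
    attention_mask = enc["attention_mask"].clone()
    attention_mask[0, -n_hidden:] = 0              # becomes roughly -1e9 inside the model

    with torch.no_grad():
        out = model(input_ids=enc["input_ids"], attention_mask=attention_mask)
    # Predict the next word from the last *visible* position.
    next_id = out.logits[0, -n_hidden - 1].argmax()
    print(tokenizer.decode([int(next_id)]))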
0 votes, 0 answers

model.bert() throws a slicing error; can anyone let me know why?

with torch.no_grad(): logits = torch.zeros(len(definitions), dtype=torch.double).to(DEVICE) for i, bert_input in list(enumerate(features)): logits[i] = model.ranking_linear( model.bert( …
0 votes, 0 answers

While trying to generate text using GPT-2 the custom loss function accesses PAD_TOKEN_ID

While training, the custom loss function tries to access the PAD_TOKEN_ID, resulting in the error below. 50257 is the PAD_TOKEN_ID and also the vocab size of GPT-2. InvalidArgumentError: {{function_node…
S_2010 • 1 • 1
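GPT-2's vocabulary has 50257 entries, so valid ids run 0..50256; a pad id of 50257 points one row past the embedding table, which is what the custom loss trips over. A minimal sketch of the two usual fixes (TF classes chosen to match the TF-style error in the question):

    from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = TFGPT2LMHeadModel.from_pretrained("gpt2")

    # Fix 1: reuse the existing eos token (id 50256) as the pad token.
    tokenizer.pad_token = tokenizer.eos_token

    # Fix 2: actually add a [PAD] token, then grow the embedding matrix to 50258 rows.
    # tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    # model.resize_token_embeddings(len(tokenizer))

    print(tokenizer.pad_token_id)  # 50256 with Fix 1, i.e. inside the vocabulary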
0 votes, 1 answer

Kaggle Code doesn't download "gpt2" language model

I am using Kaggle code to download the gpt2 language model. from transformers import AutoTokenizer, AutoModelForCausalLM device = "cuda" if torch.cuda.is_available() else "cpu" model_name = "gpt2-xl" tokenizer =…
agongji • 117 • 1 • 7
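On Kaggle the download itself usually fails because the notebook's Internet option is switched off, not because of the code. A minimal sketch of the standard loading path, with a hypothetical local fallback for offline notebooks:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model_name = "gpt2-xl"

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

    # With Internet disabled, point from_pretrained at an attached copy instead, e.g.
    # AutoModelForCausalLM.from_pretrained("/kaggle/input/gpt2-xl")  # hypothetical path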
0 votes, 1 answer

GPT2 special tokens: Ignore word(s) in input text when predicting next word

I just started using GPT2 and I have a question concerning special tokens: I'd like to predict the next word given a text input, but I want to mask some words in my input chunk using a special token. I don't want GPT2 to predict the masked words, I…
Merle • 125 • 1 • 14
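One way to keep GPT-2 from being trained to predict certain positions is to set their labels to -100, the index the loss ignores; at inference time the corresponding tokens can additionally be hidden via the attention mask as in the related question above. A minimal sketch of the labels trick, with the masked position chosen arbitrarily:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    enc = tokenizer("The secret word is banana today", return_tensors="pt")
    labels = enc["input_ids"].clone()
    labels[0, 4] = -100          # positions labelled -100 contribute nothing to the loss

    out = model(**enc, labels=labels)
    print(out.loss)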
0 votes, 0 answers

How to fix "KeyError: 0" in the Hugging Face transformers train() function

Hello, I am in dire need of your help. I am trying to fine-tune the gpt2-medium model with the Hugging Face transformers library, and I ran into the error "KeyError: 0" just when I wanted to start training. Here is my full code: import…
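A frequent cause of KeyError: 0 in train() is handing the Trainer a pandas DataFrame, which is indexed by label rather than by position. A minimal sketch of converting to a datasets.Dataset first; the column name "text" is an assumption, not from the question:

    import pandas as pd
    from datasets import Dataset
    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    df = pd.DataFrame({"text": ["first document", "second document"]})
    ds = Dataset.from_pandas(df)                     # positional indexing now works
    ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True,
                                     padding="max_length", max_length=32),
                batched=True)
    print(ds[0].keys())   # the positional lookup a raw DataFrame would fail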
0 votes, 1 answer

Huggingface Transformers (PyTorch) - Custom training loop doubles speed?

I've found something quite strange when using Huggingface Transformers with a custom training loop in PyTorch. But first, some context: I'm currently trying to fine-tune a pretrained GPT2 small (GPT2LMHeadModel; the ~170M param version) on multiple…
0 votes, 1 answer

I am getting an error at torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse) when I call the trainer.train() function of a GPT2 model

I am new to NLP and I was trying to train gpt2 on my own data. from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer, TextDataset, DataCollatorForLanguageModeling, Trainer, TrainingArguments config = GPT2Config(vocab_size=10000,…
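The excerpt builds GPT2Config(vocab_size=10000, …) while the stock GPT-2 tokenizer emits ids up to 50256, so the embedding lookup inside torch.embedding falls off the end of the table. A minimal sketch of keeping the two in step:

    from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # token ids range over 0..50256
    config = GPT2Config(vocab_size=len(tokenizer))      # 10000 is too small for those ids
    model = GPT2LMHeadModel(config)
    print(model.transformer.wte.weight.shape)           # one embedding row per token id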
0 votes, 0 answers

Getting error "Container localhost does not exist. (Could not find resource: localhost/model/wpe)" while generating with gpt2-simple

I am trying to generate text with the GPT-2 language model using the gpt2-simple library. The training process worked fine, but I am running into an error when I try to generate text with the generate() function. The error message I am receiving…
Kit • 1 • 1
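With gpt-2-simple this error typically appears when generate() runs in a session whose graph still belongs to the finished training run. A minimal sketch of the usual workaround: reset the default graph and start a fresh session before loading the checkpoint (run_name is a placeholder):

    import gpt_2_simple as gpt2
    import tensorflow as tf

    tf.compat.v1.reset_default_graph()      # drop any graph left over from training
    sess = gpt2.start_tf_sess()
    gpt2.load_gpt2(sess, run_name="run1")   # reload the fine-tuned checkpoint
    gpt2.generate(sess, run_name="run1", length=100)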
0 votes, 0 answers

Time and cost to train a DistilGPT-2 model on BookCorpus using AWS EC2

I am trying to calculate the time it would take to train a DistilGPT-2 model on the BookCorpus dataset using multiple EC2 instances for language modeling. What is the method for estimating the training time of language models?
Troi • 43 • 2
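One hedged back-of-envelope method: training a decoder-only model costs roughly 6 × parameters × tokens FLOPs, so dividing by the sustained throughput of the chosen instances gives an order-of-magnitude time estimate. Every number below is an assumption used to illustrate the arithmetic, not a benchmark:

    params = 82e6            # DistilGPT-2 parameter count (approximate)
    tokens = 1.0e9           # rough BookCorpus token budget for one epoch (assumed)
    flops_needed = 6 * params * tokens

    gpu_peak = 125e12        # e.g. one V100 at mixed precision, peak FLOP/s
    utilization = 0.3        # sustained fraction of peak, typically 20-40%
    n_gpus = 8               # e.g. one p3.16xlarge instance

    seconds = flops_needed / (gpu_peak * utilization * n_gpus)
    print(f"~{seconds / 3600:.1f} hours per epoch on this setup (very rough)")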
0 votes, 1 answer

How to fine-tune gpt2 with a custom set of unlabelled documents

I'm a newbie to GPT2 fine-tuning. My goal is to fine-tune GPT-2 (or BERT) on my own set of documents, in order to be able to query the bot about a topic contained in these documents and receive an answer. I have some doubts about how to develop this,…
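For plain unlabelled text the standard recipe is causal-language-model fine-tuning: tokenize the documents, use DataCollatorForLanguageModeling with mlm=False, and hand everything to the Trainer. A minimal sketch; the file glob and hyperparameters are placeholders, not a recommendation:

    from datasets import load_dataset
    from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Placeholder path: a folder of .txt files, one document per file.
    ds = load_dataset("text", data_files={"train": "my_documents/*.txt"})
    ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective
    args = TrainingArguments(output_dir="gpt2-finetuned", num_train_epochs=1,
                             per_device_train_batch_size=2)
    Trainer(model=model, args=args, train_dataset=ds["train"],
            data_collator=collator).train()

After training, querying the documents the way the question describes would still need a retrieval or question-answering layer on top; fine-tuning alone only adapts the model's writing style and vocabulary to the corpus.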