Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
2
votes
1 answer

How to refine a trained model in gpt2?

I'm currently working on text generation with my own text. I have trained a GPT-2 model on it, but it gives random answers; only for some questions does it give relevant ones. Is there a way to fine-tune it further or can…
2
votes
1 answer

What is the cause of HFValidationError in this code and how do I resolve this error?

My Python code in a Chaquopy Android Studio project: import torch as tc from transformers import GPT2Tokenizer, GPT2Model def generate_text(txt): """ Generate chat https://huggingface.co/gpt2 """ #Load Model files tokenizer…
baponkar
2
votes
1 answer

How to save the gpt-2-simple model after training?

I trained the gpt-2-simple chatbot model but I am unable to save it. It's important for me to download the trained model from Colab because otherwise I have to download the 355M model each time (see the code below). I tried various methods to save the…
2
votes
2 answers

How to deal with the "stack expects each tensor to be equal size" error while fine-tuning a GPT-2 model?

I tried to fine-tune a model on my personal information so that I can create a chat box where people can learn about me via ChatGPT. However, I got the error RuntimeError: stack expects each tensor to be equal size, but got [47] at entry 0 and…
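This error typically appears when token sequences of different lengths are stacked into a single batch tensor. A minimal sketch of the usual remedy, padding every sequence in a batch to a common length, is shown below in plain Python. The function name `pad_batch` and the pad id `0` are assumptions for illustration and are not taken from the question; GPT-2 ships without a pad token, so setups often reuse the end-of-text token for padding instead.

```python
def pad_batch(sequences, pad_id=0):
    """Pad lists of token ids to the length of the longest one.

    Returns (padded, attention_mask). The mask holds 1 for real tokens
    and 0 for padding. `pad_id` is a hypothetical placeholder value.
    """
    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [pad_id] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask


# Two sequences of unequal length, as in the error message.
batch = [[5, 6, 7], [8, 9]]
padded, mask = pad_batch(batch)
```

After padding, every row has the same length, so the rows can be stacked; the attention mask lets the model ignore the padded positions.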
2
votes
1 answer

How can I split my model among multiple GPUs?

I have been trying to split the self.blocks among multiple GPUs, but it returns the error "All tensors must be on same GPU." I don't want DataParallel, but ModelParallel among at least 2 GPUs, and their weights and biases should commute with each…
smackiaa
2
votes
0 answers

PyTorch with Transformer - finetune GPT2 throws index out of range Error

In my Jupyter notebook I have the following code. I cannot figure out why it throws an IndexError: index out of range in self error. Here is the code: !pip install torch !pip install torchvision !pip install transformers import torch from…
Peter Shaw
2
votes
1 answer

How to resolve "the size of tensor a (1024) must match the size of tensor b" in happytransformer

I have the following code. This code uses the GPT-2 language model from the Transformers library to generate text from a given input text. The input text is split into smaller chunks of 1024 tokens, and then the GPT-2 model is used to generate text…
littleworth
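The chunking step described in the excerpt above can be sketched in plain Python as follows. The helper name `chunk_tokens` is hypothetical and the token ids are stand-ins; this is only an illustration of splitting a long token sequence into fixed-size pieces, not the asker's code.

```python
def chunk_tokens(token_ids, max_len=1024):
    """Split a list of token ids into consecutive chunks of at most max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]


# Stand-in for a tokenized document of 2500 tokens.
token_ids = list(range(2500))
chunks = chunk_tokens(token_ids)
chunk_lengths = [len(chunk) for chunk in chunks]
```

GPT-2's context window is 1024 positions in total, covering the prompt and the newly generated tokens together, so a chunk that already fills the window leaves no room for generation. A smaller `max_len` may therefore be needed when the chunks are used as prompts.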
2
votes
1 answer

Huggingface GPT2 loss understanding

(Also posted here https://discuss.huggingface.co/t/newbie-understanding-gpt2-loss/33590) I am getting stuck with understanding the GPT2 loss. I want to give the model the label having the target it will generate so that I can see that loss is…
Alex Punnen
2
votes
0 answers

How does Huggingface's tokenizers tokenize non-English characters?

I use tokenizers to tokenize natural language sentences into tokens, but I came up with some questions. Here are some examples I tried using tokenizers: from transformers import GPT2TokenizerFast tokenizer =…
dongrixinyu
2
votes
0 answers

How to deploy GPT-like model to Triton inference server?

The tutorials on deploying GPT-like models for inference on Triton look like this: (1) Preprocess our data as input_ids = tokenizer(text)["input_ids"] (2) Feed input to Triton inference server and get outputs_ids = model(input_ids) (3) Postprocess outputs…
2
votes
0 answers

How to train GPT2 with Huggingface trainer

I am trying to fine-tune GPT2 with Hugging Face's Trainer class. from datasets import load_dataset import torch from torch.utils.data import Dataset, DataLoader from transformers import GPT2TokenizerFast, GPT2LMHeadModel, Trainer,…
2
votes
1 answer

GPT-J and GPT-Neo generate too long sentences

I fine-tuned GPT-J and GPT-Neo models on my texts and am trying to generate new text. But very often the sentences are very long (sometimes 300 characters each), although in the dataset the sentences are of normal length (50-100…
Astraport
2
votes
1 answer

How to remove input from generated text in GPTNeo?

I'm writing a program to generate text... I need to remove the input from the generated text. How can I do this? The code: input_ids = tokenizer(context, return_tensors="pt").input_ids gen_tokens = model.generate( input_ids, do_sample=True, …
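For decoder-only models such as GPT-Neo, the sequences returned by `generate` normally begin with the prompt tokens, so one common approach is to slice off the first `len(prompt)` tokens before decoding. The sketch below illustrates the idea with plain Python lists standing in for the tensors; `strip_prompt` and the token ids are hypothetical. With tensors, the equivalent slice would be `gen_tokens[:, input_ids.shape[-1]:]`.

```python
def strip_prompt(generated_ids, prompt_len):
    """Drop the first prompt_len token ids from each generated sequence."""
    return [seq[prompt_len:] for seq in generated_ids]


# Stand-ins for input_ids and the output of model.generate (batch of 1).
prompt_ids = [[101, 102, 103]]
gen_ids = [[101, 102, 103, 7, 8, 9]]

new_ids = strip_prompt(gen_ids, len(prompt_ids[0]))
```

Only the remaining ids would then be passed to the tokenizer's decode step, so the decoded text contains just the continuation.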
2
votes
3 answers

GPT-2: How do I speed up/optimize token text generation?

I am trying to generate a 20-token text using gpt-2-simple. It takes around 15 seconds to generate the sentence, while AI Dungeon takes around 4 seconds to generate a sentence of the same size. Is there a way to speed up/optimize the GPT-2 text…
Sap BH
2
votes
1 answer

Hugging Face - Efficient tokenization of unknown token in GPT2

I am trying to train a dialog system using GPT2. For tokenization, I am using the following configuration for adding the special tokens. from transformers import ( AdamW, AutoConfig, AutoTokenizer, PreTrainedModel, …