Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
0 votes • 1 answer

How do I restart Hugging Face Transformer GPT2 finetuning?

I'm trying to restart fine-tuning, but it starts from the beginning. Is this normal? I want to resume fine-tuning from a saved checkpoint. However, when I replace the line model = GPT2LMHeadModel.from_pretrained('gpt') with model =…
Blank256 • 3 • 1 • 4
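
A hedged sketch of resuming with the Trainer API: the fix is usually to pass resume_from_checkpoint to trainer.train() rather than re-loading the base model. The checkpoint path and the tiny placeholder dataset below are assumptions, not from the question.

```python
from torch.utils.data import Dataset
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

class TinyDataset(Dataset):
    """Stand-in for the real training data in the question."""
    def __init__(self, texts):
        self.items = [tokenizer(t, return_tensors="pt")["input_ids"].squeeze(0)
                      for t in texts]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return {"input_ids": self.items[i], "labels": self.items[i].clone()}

model = GPT2LMHeadModel.from_pretrained("gpt2")
args = TrainingArguments(output_dir="./results", num_train_epochs=3)
trainer = Trainer(model=model, args=args,
                  train_dataset=TinyDataset(["hello world"] * 8))

# Resume from a saved checkpoint instead of starting over; the path is
# hypothetical. Passing True instead picks the latest one in output_dir.
trainer.train(resume_from_checkpoint="./results/checkpoint-500")
```
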
0 votes • 0 answers

Fine-tuning GPT-2: validation loss increases along with accuracy and F1 score

I am fine-tuning GPT-2 on text classification with the Hugging Face Trainer. I observed that after 2 epochs my validation loss starts to increase, but my validation accuracy and F1 score still increase too. I have tried with 2 different seeds, but I…
0 votes • 1 answer

Trying to finetune GPT-2 in Vertex AI but it just freezes

I've been following some tutorials on training GPT-2 and have scraped together some code that works in Google Colab, but when I move it over to Vertex AI Workbench, it just seems to sit there and do nothing when I run the training code. I have the…
0 votes • 0 answers

Not enough memory for fine-tuning an LLM with Hugging Face

I'm running into runtime errors where I don't have enough memory to fine-tune a pretrained LLM. I'm a novelist, and I am curious to see what would happen if I fine-tune a pretrained LLM to write more chapters of my novel in my style. I successfully…
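
Before renting a bigger GPU, the usual levers are a smaller batch with gradient accumulation, mixed precision, and gradient checkpointing. A minimal sketch with Hugging Face TrainingArguments; the values are illustrative, not tuned:

```python
from transformers import GPT2LMHeadModel, TrainingArguments

model = GPT2LMHeadModel.from_pretrained("gpt2")
# Trade compute for memory: recompute activations during backward.
model.gradient_checkpointing_enable()

args = TrainingArguments(
    output_dir="./out",
    per_device_train_batch_size=1,  # smallest possible per-step batch
    gradient_accumulation_steps=8,  # still an effective batch of 8
    fp16=True,                      # mixed precision; needs a CUDA GPU
)
```
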
0 votes • 0 answers

How to read tokens from a Hugging Face GPT-2 TFLite model using the Interpreter

I have recently converted a pre-trained GPT-2 model to TFLite and am trying to use an interpreter to generate text from the prompt. Please find my code below, which does the following: converting the pre-trained model to TFLite works fine; creating…
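
The generic TFLite loop is: allocate tensors, write the prompt's token ids into the input tensor, invoke, then decode the output with the same tokenizer. A rough sketch, assuming the exported model takes input_ids and returns logits; the file name and tensor shapes depend on how the conversion was done:

```python
import numpy as np
import tensorflow as tf
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
interpreter = tf.lite.Interpreter(model_path="gpt2.tflite")  # hypothetical path
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

ids = tokenizer("Hello, my name is", return_tensors="np")["input_ids"]
# Fixed-shape models may require padding/truncating to the expected length.
interpreter.set_tensor(inp["index"], ids.astype(inp["dtype"]))
interpreter.invoke()

logits = interpreter.get_tensor(out["index"])  # assumed (1, seq_len, vocab)
next_id = int(np.argmax(logits[0, -1]))        # greedy next-token choice
print(tokenizer.decode([next_id]))
```
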
0 votes • 1 answer

I want to make an AI text classifier using the OpenAI API, based on GPT-2, but I cannot find the API documentation for GPT-2

I wanted to create an AI text classifier project for my college. I wanted to use a GPT-2 API for this, as it is more reliable at catching content generated by GPT-3.5. So how can I use the GPT-2 documentation? Also, any useful resources for the same are…
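
There is no hosted OpenAI API for GPT-2; the weights are open, so the usual route is Hugging Face. For detection specifically, OpenAI released a detector trained on GPT-2 output (RoBERTa-based, not GPT-2 itself). A sketch via a pipeline; the repo id is an assumption and may have moved on the Hub:

```python
from transformers import pipeline

# OpenAI's GPT-2 output detector, republished on the Hugging Face Hub.
# The repo id below is an assumption; search the Hub if it has moved.
detector = pipeline(
    "text-classification",
    model="openai-community/roberta-base-openai-detector",
)
print(detector("This essay was definitely written by a human being."))
```
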
0 votes • 1 answer

Getting an error 'no file named tf_model.h5 or pytorch_model.bin found in directory gpt2'

model_name = "gpt2" model = TFGPT2ForSequenceClassification.from_pretrained(model_name) tokenizer = GPT2Tokenizer.from_pretrained(model_name) tokenizer.add_special_tokens({'pad_token': '[PAD]'}) When I am running the above code, the model is…
0 votes • 1 answer

Does OpenAI GPT fine-tuning consider the prompt in the loss function?

The OpenAI API includes a fine-tuning service that divides the task into "prompt" and "completion": https://platform.openai.com/docs/guides/fine-tuning. The documentation says that the accuracy metrics are calculated with respect to the completion. But for the…
arivero • 777 • 1 • 9 • 30
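
OpenAI has not published that detail of its hosted service, but the open-source convention the question is circling is explicit: when fine-tuning a causal LM, prompt positions are masked with -100 so the loss covers only the completion. A sketch of that mechanism with Hugging Face GPT-2, as an illustration rather than a statement about OpenAI's internals:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Translate to French: cheese ->"
completion = " fromage"

prompt_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
comp_ids = tokenizer(completion, return_tensors="pt")["input_ids"]
input_ids = torch.cat([prompt_ids, comp_ids], dim=1)

# Positions labeled -100 are ignored by the cross-entropy loss,
# so only the completion tokens contribute.
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=input_ids, labels=labels).loss
print(float(loss))
```
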
0 votes • 1 answer

How to train or fine-tune GPT-2 / GPT-J model for generative question answering?

I am new to using Hugging Face models, though I have some basic understanding of its models, tokenizers, and training. I am looking for a way to leverage generative models like GPT-2 and GPT-J from the Hugging Face community and tune them for the…
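
For generative QA with a decoder-only model, a common recipe is to serialize each example into one text with question/answer markers and train with the ordinary causal-LM objective. A minimal data-formatting sketch; the template and the example pairs are arbitrary choices, not a library requirement:

```python
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

pairs = [  # hypothetical examples
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]

# A decoder-only model sees one stream, so QA becomes text completion.
texts = [f"Question: {q}\nAnswer: {a}{tokenizer.eos_token}" for q, a in pairs]
batch = tokenizer(texts, padding=True, return_tensors="pt")
# batch["input_ids"] (with labels = input_ids) then feeds a Trainer;
# at inference time, prompt with "Question: ...\nAnswer:" and generate.
```
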
0 votes • 1 answer

cannot import name 'GPT2ForQuestionAnswering' from 'transformers'

1 import pandas as pd
2 import torch
----> 3 from transformers import GPT2Tokenizer, GPT2ForQuestionAnswering, AdamW
4 from transformers import default_data_collator
5 from torch.utils.data import DataLoader
…
Lofiuiu • 31 • 4
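
GPT2ForQuestionAnswering only exists in relatively recent Transformers releases, so this ImportError usually just means the installed version predates the class. A hedged check and workaround:

```python
# First: pip install --upgrade transformers
import transformers
print(transformers.__version__)  # the class needs a recent 4.x release

from transformers import GPT2Tokenizer, GPT2ForQuestionAnswering
# AdamW has been deprecated/removed from transformers itself;
# importing it from torch is the durable choice.
from torch.optim import AdamW
```
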
0 votes • 0 answers

Train GPT-2 on custom data

I was looking for a way to train GPT-2 on my own textual data and found a notebook here: https://www.kaggle.com/code/ashiqabdulkhader/train-gpt-2-on-custom-language. Everything works fine (the model building, the dataset building), but it shows…
0 votes • 0 answers

nanoGPT with custom dataset

I am trying to use nanoGPT from https://github.com/karpathy/nanoGPT on my custom input file. I have posted this issue on the repo itself (at issue 172) but am not getting any response there, hence looking for some advice here on…
Sujata • 42 • 7
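
nanoGPT trains from pre-tokenized train.bin/val.bin files written by a prepare.py script, so a custom corpus has to go through the same step. A sketch modeled on the repo's data/shakespeare/prepare.py; the input file name and the 90/10 split are assumptions:

```python
import numpy as np
import tiktoken

# Tokenize a custom corpus into the train.bin / val.bin files that
# nanoGPT's train.py expects (mirrors data/shakespeare/prepare.py).
with open("input.txt", "r", encoding="utf-8") as f:
    data = f.read()

n = len(data)
train_text, val_text = data[: int(n * 0.9)], data[int(n * 0.9):]

enc = tiktoken.get_encoding("gpt2")
train_ids = np.array(enc.encode_ordinary(train_text), dtype=np.uint16)
val_ids = np.array(enc.encode_ordinary(val_text), dtype=np.uint16)

train_ids.tofile("train.bin")
val_ids.tofile("val.bin")
print(f"train: {len(train_ids):,} tokens, val: {len(val_ids):,} tokens")
```
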
0 votes • 0 answers

Using GPT2 to find commonalities in text records

I have a dataset with many incidents, and most of the data is in free-text form: one row per incident and a text field describing what happened. I tried to train a GPT-2 model on the free text and then tried prompts such as "The person got burned because", and want…
user295944 • 273 • 4 • 17
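
Prompting a fine-tuned model goes through generate, though sampled continuations are plausible text rather than aggregated evidence, so they may not surface true commonalities. A generation sketch with the stock model for illustration:

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The person got burned because", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,            # sampled output varies run to run
    top_p=0.9,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```
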
0 votes • 0 answers

How to add encoder's last hidden state to GPT2 as encoder-decoder attention?

I have a BERT-based encoder model (encoder) and I want to feed the last hidden state output of this to a GPT2-based model (decoder). I don't see any options in transformers.GPT2Config to use the encoder's last hidden layer as input to GPT2. How do I…
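
GPT2Config does have a switch for this: add_cross_attention=True inserts encoder-decoder attention layers, and the encoder output is then passed as encoder_hidden_states in the forward call. A minimal sketch wiring a BERT encoder into GPT-2 (both are 768-dimensional here, so no projection is needed; the new cross-attention weights start untrained):

```python
import torch
from transformers import (BertModel, BertTokenizerFast, GPT2Config,
                          GPT2LMHeadModel, GPT2TokenizerFast)

enc_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

config = GPT2Config.from_pretrained("gpt2", add_cross_attention=True)
decoder = GPT2LMHeadModel.from_pretrained("gpt2", config=config)
dec_tok = GPT2TokenizerFast.from_pretrained("gpt2")

src = enc_tok("A sentence for the encoder.", return_tensors="pt")
with torch.no_grad():
    enc_out = encoder(**src).last_hidden_state  # (1, src_len, 768)

tgt = dec_tok("The decoder continues", return_tensors="pt")
out = decoder(input_ids=tgt["input_ids"],
              encoder_hidden_states=enc_out,
              encoder_attention_mask=src["attention_mask"])
print(out.logits.shape)  # cross-attention layers need fine-tuning to be useful
```
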
0 votes • 0 answers

Removing tokens from the GPT tokenizer

How can I remove unwanted sub-tokens from the GPT vocabulary or tokenizer? I have tried an existing approach that was used for a RoBERTa-style model (https://github.com/huggingface/transformers/issues/15032). However, it fails at the…
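
One workaround discussed in that issue thread is to edit the tokenizer's serialized state: drop the unwanted entries from the BPE vocab along with any merges that produce them, then reload. A fragile sketch; the token chosen is illustrative, and the model's embedding matrix would also need remapping, since ids shift or disappear:

```python
import json
from transformers import GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
state = json.loads(tok.backend_tokenizer.to_str())

unwanted = {"ĠHello"}  # example sub-token; "Ġ" encodes a leading space

state["model"]["vocab"] = {
    t: i for t, i in state["model"]["vocab"].items() if t not in unwanted
}

# Merges may be serialized as "a b" strings or [a, b] pairs depending on
# the tokenizers version; drop any merge whose result is unwanted.
def merge_result(m):
    a, b = m.split(" ") if isinstance(m, str) else m
    return a + b

state["model"]["merges"] = [
    m for m in state["model"]["merges"] if merge_result(m) not in unwanted
]

with open("tokenizer.json", "w") as f:
    json.dump(state, f)
new_tok = GPT2TokenizerFast(tokenizer_file="tokenizer.json")
```
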