Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
2
votes
0 answers

Getting MemoryError fine-tuning GPT2 (355M) model with small datasets (3MB) through aitextgen

I'm using aitextgen to fine-tune the 355M GPT-2 model using the train function. The datasets are small txt files consisting of lines like these (these are encoded texts for keyword-based text generation, hence the…
Cephylist
2
votes
0 answers

GPT2 on Apple M1 Pro chip

While trying to install GPT2 according to the instructions on the official GitHub repo, I ended up with an "illegal hardware instruction" error when I tried to use it. That means I shouldn't even think of trying GPT2 on an M1 Pro chip (though the…
user16510763
2
votes
1 answer

How to get onnx format from pretrained GPT2 models?

I'm trying to convert the KoGPT2 model, which is pretrained from GPT2, to ONNX format so that I can then convert it to TensorFlow format. I used convert_graph_to_onnx in transformers, but it didn't work for some reason. I don't know what this…
Sooyong
2
votes
1 answer

How to increase batch size in GPT2 training for translation task?

I am writing code to use the pre-trained GPT2 model for a machine translation task. The length of my data's word-to-id is 91, and I wrote the following code for my model: import torch from torch.utils.data import DataLoader from…
K.N
2
votes
1 answer

Mismatched tensor size error when generating text with beam_search (huggingface library)

I'm using the huggingface library to generate text using the pre-trained distilgpt2 model. In particular, I am making use of the beam_search function, as I would like to include a LogitsProcessorList (which you can't use with the generate…
oregano
2
votes
2 answers

AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'

I am just using the huggingface transformer library and get the following message when running run_lm_finetuning.py: AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'. Anyone else with this problem or an idea how to fix it?…
2
votes
1 answer

Flask app serving GPT2 on Google Cloud Run not persisting downloaded files?

I have a Flask app running on Google Cloud Run, which needs to download a large model (GPT-2 from huggingface). This takes a while to download, so I am trying to set it up so that it downloads only on deployment and then just serves this up for…
L Xandor
2
votes
2 answers

Modifying the Learning Rate in the middle of the Model Training in Deep Learning

Below is the code to configure TrainingArguments consumed from the HuggingFace transformers library to finetune the GPT2 language model. training_args = TrainingArguments( output_dir="./gpt2-language-model", #The output directory …
2
votes
1 answer

How to use GPT-2 for topic modelling?

I want to generate topics and subtopics from a corpus. It would be great if someone could share the Python code.
2
votes
1 answer

How to Get Rid of GPT-2 Warning Message?

Every time I run GPT-2, I am receiving this message. Is there a way I can get this to go away? Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias',…
Johnny
2
votes
4 answers

How can I find the probability of a sentence using GPT-2?

I'm trying to write a program that, given a list of sentences, returns the most probable one. I want to use GPT-2, but I am quite new to using it (as in I don't really know how to do it). I'm planning on finding the probability of a word given the…
Elan SK
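
The approach described in the excerpt above (scoring a sentence from the probability of each word given what came before) is the chain rule: log P(sentence) is the sum of log P(token_i | preceding tokens). The sketch below is only an illustration under stated assumptions: `TOY_BIGRAM` is a hypothetical hand-written table that conditions on the previous token alone, standing in for the conditional probabilities that GPT-2 would produce from the full prefix. The names `sentence_log_prob` and `most_probable` are also hypothetical.

```python
import math

# Hypothetical toy conditional probabilities P(next | previous).
# Assumption: a real setup would take these from GPT-2's softmax output,
# conditioned on the whole prefix rather than on one previous token.
TOY_BIGRAM = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("the", "dog"): 0.1,
    ("cat", "sleeps"): 0.3,
    ("dog", "sleeps"): 0.3,
}


def sentence_log_prob(tokens, cond_prob, floor=1e-8):
    """Chain rule: add up log P(token | previous token) over the sentence.

    Pairs missing from the table get a small floor probability so that
    math.log never receives zero.
    """
    total = 0.0
    prev = "<s>"  # start-of-sentence marker
    for tok in tokens:
        total += math.log(cond_prob.get((prev, tok), floor))
        prev = tok
    return total


def most_probable(sentences, cond_prob):
    """Return the sentence with the highest total log-probability."""
    return max(sentences, key=lambda s: sentence_log_prob(s.split(), cond_prob))


candidates = ["the cat sleeps", "the dog sleeps"]
print(most_probable(candidates, TOY_BIGRAM))
```

Working in log space avoids underflow on long sentences. Because longer sentences accumulate more negative terms, comparing sentences of different lengths may call for dividing the total by the token count.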
2
votes
3 answers

How to alter gpt-2 code to work with TensorFlow 2.0?

I am trying to use gpt-2 for text generation. I get compatibility errors, even after running the TensorFlow 2.0 code upgrade script. Steps I've followed: (1) Clone repo. (2) From here on out, follow the directions in DEVELOPERS.md. (3) Run upgrade script on…
Nick
2
votes
1 answer

Cannot convert GPT-2 model using TensorFlow.js

I'm trying to load a GPT-2 model in a Node.js project. I believe this could be done using the tfjs library, so I tried to convert the GPT-2 model to a tfjs model. Following the recommendations in this answer, I exported the GPT-2 model as a SavedModel. !python3…
Mohamed Taher Alrefaie
2
votes
1 answer

Is gpt-2 unusable with Python?

I was following this tutorial and ran across an issue while using train.py. The error says: Exception has occurred: ModuleNotFoundError No module named 'tensorflow.contrib' File "F:\PythonFiles\Post Generator\gpt-2\src\model.py", line 3, in…
2
votes
0 answers

Adding tokens to GPT-2 BPE tokenizer

I want to add new words to my BPE tokenizer. I know the symbol Ġ means the end of a new token and the majority of tokens in vocabs of pre-trained tokenizers start with Ġ. Assume I want to add the word Salah to my tokenizer. I tried to add both Salah…
Akim
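
For readers of the tokenizer question above: in GPT-2's byte-level BPE, Ġ is the printable stand-in for the space byte, so a vocabulary entry beginning with Ġ is a piece that follows a space (the start of a word) rather than the end of a token. The sketch below reproduces the byte-to-character mapping scheme from GPT-2's published encoder using only the standard library. The helper name `to_bpe_surface` is hypothetical, and the sketch shows only the surface form before any BPE merges are applied, not a full tokenizer.

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a printable character.

    Bytes that are already printable keep their own character; the rest
    (control bytes, space, and a few others) are shifted to code points
    from 256 upward so that every byte has a visible representation.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


BYTE_TO_CHAR = bytes_to_unicode()


def to_bpe_surface(text):
    """Hypothetical helper: show text as the byte-level alphabet sees it."""
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


print(to_bpe_surface(" Salah"))  # leading space becomes Ġ: ĠSalah
print(to_bpe_surface("Salah"))   # no leading space, no Ġ: Salah
```

Under this mapping, "Salah" and " Salah" are different strings to the tokenizer, which may explain why adding one form does not cover occurrences of the other.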