Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
1 vote · 1 answer

What are the suffix and prefix prompts in OpenAI Codex?

I have been trying to understand what the suffix prompt is, in addition to the prefix prompt, in Codex. They have provided an example def get_largest_prime_factor(n): if n < 2: return False def is_prime(n): > for i in range(2, n): > …
Exploring • 2,493 • 11 • 56 • 97
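
For context, the pre-1.0 OpenAI completions API exposed a suffix parameter alongside the prompt (the prefix), and the model fills in the span between them. A minimal sketch; the model name and the suffix text are illustrative, not the asker's exact setup:

```python
import openai  # pre-1.0 openai client

# Insert mode: `prompt` is the prefix, `suffix` is the text the
# completion must lead into; the model generates what goes between.
response = openai.Completion.create(
    model="code-davinci-002",  # illustrative Codex model name
    prompt="def get_largest_prime_factor(n):\n    if n < 2:\n        return False\n",
    suffix="\n    return largest_factor\n",  # hypothetical suffix
    max_tokens=128,
    temperature=0,
)
print(response["choices"][0]["text"])
```
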
1 vote · 0 answers

How to force GPT2 to generate specific tokens in each sentence?

My input is a string and the outputs are vector representations (corresponding to the generated tokens). I'm trying to force the outputs to contain specific tokens (e.g., 4 commas, 2 occurrences of the word "to", etc.). That is, each generated sentence must have…
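
The transformers library's generate() has a force_words_ids argument (used with beam search) that guarantees each listed word appears at least once; it does not enforce exact counts like "4 commas", so this is only a partial sketch of the idea:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Each inner list is a token sequence that must appear in the output.
force_words_ids = tokenizer([" to", ","], add_special_tokens=False).input_ids

inputs = tokenizer("The weather today", return_tensors="pt")
outputs = model.generate(
    **inputs,
    force_words_ids=force_words_ids,
    num_beams=4,           # constrained decoding requires beam search
    max_new_tokens=30,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
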
1 vote · 0 answers

Error when using model.generate() from Transformers - TypeError: forward() got an unexpected keyword argument 'return_dict'

I am trying to perform inference with a fine-tuned GPT2HeadWithValueModel from the Transformers library. I'm using the model.generate() method from generation_utils.py. I am using this function to call the generate() method: def top_p_sampling(text,…
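
A TypeError like this typically means the installed transformers version does not match what GPT2HeadWithValueModel (from the trl library) was written against. For comparison, a top-p sampling helper against a plain GPT2LMHeadModel might look like this sketch (the function name mirrors the question; the body is illustrative):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def top_p_sampling(text, top_p=0.9, max_new_tokens=50):
    # Nucleus (top-p) sampling via the standard generate() keywords.
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(
        input_ids,
        do_sample=True,
        top_p=top_p,
        top_k=0,               # disable top-k so only top-p filters
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(top_p_sampling("Once upon a time"))
```
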
1 vote · 1 answer

CUDA out of memory while fine-tuning GPT2

RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 11.17 GiB total capacity; 10.49 GiB already allocated; 13.81 MiB free; 10.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting…
Vortekus • 69 • 1 • 9
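
The usual levers for this error are a smaller per-device batch, gradient accumulation, mixed precision, and gradient checkpointing. A sketch using the transformers Trainer arguments; the values are illustrative starting points, not guaranteed fixes:

```python
from transformers import TrainingArguments

# Fit GPT-2 fine-tuning into a ~11 GiB GPU: trade compute for memory.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # smallest possible micro-batch
    gradient_accumulation_steps=8,   # effective batch size of 8
    fp16=True,                       # roughly halve activation memory
    gradient_checkpointing=True,     # recompute activations on backward
)
```
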
1 vote · 2 answers

AttributeError: 'GPT2Model' object has no attribute 'gradient_checkpointing'

I am trying to load a fine-tuned GPT-2 model in Flask during initialization. The model is loaded in the init functions using: app.modelgpt2 = torch.load('models/model_gpt2.pt', map_location=torch.device('cpu')) app.modelgpt2tokenizer =…
kewlzilla • 11 • 4
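
This AttributeError usually appears when a model pickled with torch.save under one transformers version is unpickled under another, since the class layout changed between versions. A sketch of the more portable save/load path; the paths mirror the question and are illustrative:

```python
import torch
from transformers import GPT2LMHeadModel

# One-time conversion, run under the transformers version that
# created the pickle:
model = torch.load('models/model_gpt2.pt', map_location=torch.device('cpu'))
model.save_pretrained('models/model_gpt2')  # config + weights, no pickle

# Then, in the Flask init code, load version-independently:
model = GPT2LMHeadModel.from_pretrained('models/model_gpt2')
```
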
1 vote · 1 answer

Spacy-Transformers: Access GPT-2?

I'm using Spacy-Transformers to build some NLP models. The Spacy-Transformers docs say: "spacy-transformers: spaCy pipelines for pretrained BERT, XLNet and GPT-2". The sample code on that page shows: import spacy nlp =…
VikR • 4,818 • 8 • 51 • 96
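
Under spaCy v3 with spacy-transformers, the transformer component can wrap a Hugging Face checkpoint by name, which is one way to reach GPT-2. This is a sketch under those assumptions; the partial config is merged with the component's defaults:

```python
import spacy

# Assumes spaCy v3 + spacy-transformers are installed. Only the
# checkpoint name is overridden in the component config.
nlp = spacy.blank("en")
nlp.add_pipe("transformer", config={"model": {"name": "gpt2"}})
nlp.initialize()  # loads the Hugging Face weights

doc = nlp("Accessing GPT-2 through spacy-transformers")
# Wordpiece-level output tensors from the transformer:
print(doc._.trf_data.tensors[0].shape)
```
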
1 vote · 1 answer

No module named 'tensorflow.contrib' even on tensorflow 1.13.2

I can't import the gpt_2_simple package due to an error: ModuleNotFoundError: No module named 'tensorflow.contrib'. I have installed Python 3.7 and tried to install tensorflow 1.15.5, 1.15.2 and 1.13.2, and all of them gave this error. I'm using…
LUTIY • 13 • 2
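
tensorflow.contrib was removed in TensorFlow 2.0, so this error normally means the interpreter is still resolving a TF 2.x install despite the attempted downgrades (for example, pip installed into a different environment). A quick diagnostic sketch:

```python
import tensorflow as tf

# gpt_2_simple needs TF 1.x; contrib does not exist in 2.x.
print(tf.__version__)

# On a genuine 1.x install this import succeeds; on 2.x it raises
# ModuleNotFoundError: No module named 'tensorflow.contrib'.
import tensorflow.contrib
```
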
1 vote · 0 answers

fill-mask usage from transformers pipeline

I fine-tuned a gpt2 language model and I am generating text from my model using the following lines of code: generator = pipeline('text-generation', tokenizer='gpt2', model='data/out') print(generator('Once upon a time',…
Naqi • 135 • 12
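
Worth noting: the fill-mask pipeline requires a masked language model, and GPT-2 is a causal model with no mask token, so a GPT-2 checkpoint only fits text-generation. A sketch contrasting the two; the model names are stock checkpoints, not the asker's fine-tuned one:

```python
from transformers import pipeline

# fill-mask needs a BERT-style masked LM and its mask token.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
print(unmasker("Once upon a [MASK]."))

# GPT-2 (causal LM) is served by text-generation instead.
generator = pipeline("text-generation", model="gpt2")
print(generator("Once upon a time", max_length=30))
```
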
1 vote · 1 answer

Is there an 'untrained' gpt model folder?

Maybe a crazy question, but I want to download the gpt-2 model framework with the weights initialized randomly, as if the model still had to be fine-tuned on the Reddit content (including json, vocab, meta & index files etc). Is this…
m.b • 45 • 1 • 4
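
With the transformers library no special download is needed: constructing the model from a GPT2Config gives the architecture with randomly initialized weights, while the vocab files can be reused from the pretrained tokenizer. A minimal sketch:

```python
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

config = GPT2Config()            # 124M-sized architecture by default
model = GPT2LMHeadModel(config)  # weights are randomly initialized

# Tokenizer/vocab files are fixed artifacts, so reuse the released ones:
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
```
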
1 vote · 1 answer

Understanding how GPT-2 tokenizes strings

Using the tutorials here, I wrote the following code: from transformers import GPT2Tokenizer, GPT2Model import torch tokenizer = GPT2Tokenizer.from_pretrained('gpt2') model = GPT2Model.from_pretrained('gpt2') inputs = tokenizer("Hello, my dog is…
Kadaj13 • 1,423 • 3 • 17 • 41
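
To see what the tokenizer actually produces, the intermediate steps can be inspected directly; a short sketch using the same checkpoint as the question:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# GPT-2 uses byte-level BPE; "Ġ" marks a token that begins with a space.
tokens = tokenizer.tokenize("Hello, my dog is cute")
ids = tokenizer.convert_tokens_to_ids(tokens)
print(tokens)                 # ['Hello', ',', 'Ġmy', 'Ġdog', 'Ġis', 'Ġcute']
print(ids)
print(tokenizer.decode(ids))  # round-trips to the original string
```
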
1 vote · 0 answers

Huggingface GPT transformers layers output

I'm trying to use a GPT language model and get the weights it assigns to each word in the last state of text generation. My model is a GPT2 from the transformers library. Below is how I call the pretrained model: tokenizer =…
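
The per-layer activations and attention weights can be requested at load time; a sketch, assuming the goal is the attention each token receives in the final layer:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained(
    "gpt2", output_hidden_states=True, output_attentions=True
)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(len(outputs.hidden_states))    # 13: embeddings + 12 layers
print(outputs.attentions[-1].shape)  # (batch, heads, seq, seq), last layer
```
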
1 vote · 1 answer

What is tokenizer.max_len doing in this class definition?

I am following Rostylav's tutorial found here and am running into an error I don't quite understand: AttributeError Traceback (most recent call last) in () ----> 1 main(trn_df,…
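
The error in that tutorial is usually a transformers version issue: tokenizer.max_len was deprecated and then removed in v4, with model_max_length as the replacement. A sketch of the check:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Old tutorials use tokenizer.max_len, which no longer exists in
# transformers v4; model_max_length is the replacement attribute.
print(tokenizer.model_max_length)  # 1024 for GPT-2
```
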
1 vote · 1 answer

Key difference between BERT and GPT2?

I have read lots of articles where people say BERT is good for NLU while GPT is good for NLG. But the key structural difference between them is just whether a mask is applied in self-attention, plus the different ways the models are trained. From the code…
neese • 43 • 1 • 6
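
The masking difference the question refers to can be shown in a few lines: GPT-2's decoder blocks apply a causal (lower-triangular) mask so position i only attends to positions up to i, while BERT's encoder applies no such mask. A sketch of that mask:

```python
import torch

# Causal mask used by GPT-2-style decoders: row i can attend only to
# columns 0..i. BERT-style encoders omit this mask entirely, which is
# why BERT trains with masked-token prediction rather than next-token
# prediction.
seq_len = 5
causal_mask = torch.tril(torch.ones(seq_len, seq_len))
print(causal_mask)
```
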
1 vote · 1 answer

GPT2Simple having issues running

I am trying to run this GPT2Simple sample but I am getting errors: Original stack trace for 'model/MatMul': File "c:/Users/Jerome Ariola/Desktop/Machine Learning Projects/gpt test.py", line 32, in steps=1) File "C:\Program…
Jerome Ariola • 135 • 1 • 11
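
For reference, a minimal gpt-2-simple run looks like the sketch below (TensorFlow 1.x required; the training file name is illustrative):

```python
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")  # fetch the 124M checkpoint

sess = gpt2.start_tf_sess()            # fresh TF 1.x session
gpt2.finetune(sess, "training.txt", model_name="124M", steps=1)
gpt2.generate(sess)
```
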
1 vote · 3 answers

Changes in GPT2/GPT3 model during few-shot learning

During transfer learning, we take a pre-trained network and some observation pairs (input and label), and use these data to fine-tune the weights by backpropagation. However, during one-shot/few-shot learning, according to this paper - 'Language…
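
Per that paper ('Language Models are Few-Shot Learners'), the key point is that nothing changes in the model: no gradients flow and no weights are updated; the demonstrations are simply placed in the prompt at inference time. A sketch of the few-shot prompt format, using the paper's translation example:

```python
# In-context few-shot learning: the model's weights are frozen; the
# "training examples" are just prepended to the query as text.
few_shot_prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"      # demonstration 1
    "peppermint => menthe poivrée\n"    # demonstration 2
    "cheese =>"                         # the actual query
)
# This string is sent to the model as ordinary input; conditioning on
# the demonstrations is what produces the few-shot behavior.
```
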