Questions tagged [gpt-2]

Use this tag for questions about Generative Pre-trained Transformer 2 (GPT-2). Do not use it for GPT-3 or for the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
3 votes · 3 answers

How do GPT-like transformers use only the decoder for sequence generation?

I want to code a GPT-like transformer for a specific text generation task. GPT-like models use only the decoder block (in stacks) [1]. I know how to code all sub-modules of the decoder block shown below (from the embedding to the softmax layer) in…
mac179
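A minimal sketch of the decoder-only idea, assuming PyTorch; all class and variable names here are illustrative, not from the question. A stack of masked self-attention blocks plus an LM head is enough: the causal mask keeps each position from seeing the future, and generation just feeds each predicted token back in as input.

```python
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Illustrative decoder-only stack: token/position embeddings ->
    masked self-attention blocks -> LM head over the vocabulary."""
    def __init__(self, vocab_size=100, d_model=64, n_heads=4, n_layers=2, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        # An encoder layer driven with a causal mask behaves like a GPT decoder block:
        # masked self-attention only, no cross-attention to an encoder.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        t = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(t, device=ids.device))
        # Causal mask: -inf above the diagonal, so position i cannot attend to j > i.
        causal = torch.triu(torch.full((t, t), float("-inf"), device=ids.device), diagonal=1)
        return self.lm_head(self.blocks(x, mask=causal))

@torch.no_grad()
def generate(model, ids, steps=10):
    for _ in range(steps):
        logits = model(ids)                                # (batch, seq, vocab)
        next_id = logits[:, -1].argmax(-1, keepdim=True)   # greedy pick for the last position
        ids = torch.cat([ids, next_id], dim=1)             # feed the prediction back in
    return ids

print(generate(TinyDecoder(), torch.zeros(1, 1, dtype=torch.long)))
```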
3 votes · 0 answers

"RuntimeError: Expected target size" error for the nn.CrossEntropyLoss() function

I am trying to train a GPT-2 model to take in a tokenized/padded input and predict the output. My batch size is 32. My max length is 343. I believe that the 768 comes from the model. I cannot get the loss function to work properly though. The…
C_Dog
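A sketch of the shape bookkeeping nn.CrossEntropyLoss expects: (N, C) logits against (N,) integer targets, so batch and sequence dimensions are flattened together. The 32 and 343 come from the question; the vocabulary size shown is GPT-2's, and the loss has to see vocabulary logits from the LM head, not 768-dimensional hidden states.

```python
import torch
import torch.nn as nn

batch_size, seq_len, vocab_size = 32, 343, 50257  # 32 and 343 from the question; 50257 is GPT-2's vocab
logits = torch.randn(batch_size, seq_len, vocab_size)        # what the LM head produces
targets = torch.randint(0, vocab_size, (batch_size, seq_len))  # token ids to predict

loss_fn = nn.CrossEntropyLoss()
# CrossEntropyLoss wants (N, C) logits and (N,) integer targets,
# so flatten the batch and sequence dimensions together.
loss = loss_fn(logits.view(-1, vocab_size), targets.view(-1))
print(loss)
```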
3 votes · 1 answer

Text generation AI models generate repeated/duplicate text/sentences. What am I doing incorrectly? (Hugging Face models, Meta GALACTICA)

I have worked all day with the available text generation models. Here you can find a list of them: https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads I want to generate longer text outputs; however, with multiple different models,…
Furkan Gözükara
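One common way to curb repetition with Hugging Face's generate() is to sample instead of decoding greedily and to penalize repeats. A sketch below, using plain GPT-2 as a stand-in checkpoint; the same keyword arguments apply to other causal LMs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # example checkpoint, not the one from the question
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The future of AI is", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,            # sample instead of greedy decoding
    top_p=0.92,                # nucleus sampling
    temperature=0.8,
    repetition_penalty=1.2,    # penalize tokens that already appeared
    no_repeat_ngram_size=3,    # forbid repeating any 3-gram
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```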
3 votes · 1 answer

Getting logits from T5 Hugging Face model using forward() method without labels

For my use case, I need to obtain the logits from T5's forward() method without inputting labels. I know that forward() and .generate() are different (see here). I have also seen this post in which the logits were obtained but labels had to be…
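T5's forward() needs decoder inputs even when no labels are supplied. A sketch (using t5-small as an example checkpoint) that seeds the decoder with its start token and reads the logits back:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

enc = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")

# forward() requires decoder inputs when labels are absent; start the decoder
# with its start token (for T5 this is the pad token) and extend it with any
# target prefix you want logits for.
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])

outputs = model(input_ids=enc.input_ids,
                attention_mask=enc.attention_mask,
                decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (batch, decoder_seq_len, vocab_size)
```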
3 votes · 1 answer

Should I adjust the embedding weights of newly added tokens?

I'm a beginner in natural language processing. Recently, I have been trying to train a text generation model based on GPT-2 with Hugging Face transformers. I added some new tokens to the tokenizer and resized the embedding of the model with…
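A sketch of the usual add-tokens-then-resize pattern, plus one optional heuristic for the new rows (initializing them to the mean of the existing embeddings, to be adjusted by fine-tuning). The token strings are examples, not the question's.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

new_tokens = ["<speaker1>", "<speaker2>"]      # example tokens, not from the question
num_added = tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))  # appends randomly initialized rows

# Optional heuristic: start the new rows at the mean of the existing embeddings
# so they begin in a plausible region of the embedding space.
with torch.no_grad():
    emb = model.get_input_embeddings().weight
    emb[-num_added:] = emb[:-num_added].mean(dim=0)
```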
3 votes · 1 answer

How to early-stop an autoregressive model with a list of stop words?

I am using the GPT-Neo model from transformers to generate text. Because the prompt I use starts with '{', I would like to stop the generation once the matching '}' is produced. I found that there is a StoppingCriteria class in the source code but…
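transformers exposes StoppingCriteria for exactly this. A sketch of a custom criterion that decodes the generated text at each step and stops once a stop string such as '}' appears; the GPT-Neo checkpoint named here is just an example.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          StoppingCriteria, StoppingCriteriaList)

class StopOnStrings(StoppingCriteria):
    """Stop generation as soon as the decoded text contains any stop string."""
    def __init__(self, stop_strings, tokenizer):
        self.stop_strings = stop_strings
        self.tokenizer = tokenizer

    def __call__(self, input_ids, scores, **kwargs):
        text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return any(s in text for s in self.stop_strings)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")  # example checkpoint
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

inputs = tokenizer('{ "name":', return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=100,
    stopping_criteria=StoppingCriteriaList([StopOnStrings(["}"], tokenizer)]),
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0]))
```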
3 votes · 0 answers

Incremental training / pause-and-resume training for GPT-2 language modeling

I'm currently trying to learn Python and, at the same time, machine learning with GPT-2 language modeling. I have had some problems, got over most of them, and finally have something decent running. But... as most of you probably…
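A sketch of the pause-and-resume pattern, assuming the Hugging Face transformers API rather than any particular training script: persist the model, tokenizer, and optimizer state, then reload all three to continue. The paths are illustrative.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# --- pausing: persist everything needed to continue later ---
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.save_pretrained("checkpoints/run1")          # weights + config
tokenizer.save_pretrained("checkpoints/run1")
torch.save(optimizer.state_dict(), "checkpoints/run1/optimizer.pt")

# --- resuming: reload and keep training from the same state ---
model = GPT2LMHeadModel.from_pretrained("checkpoints/run1")
tokenizer = GPT2Tokenizer.from_pretrained("checkpoints/run1")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
optimizer.load_state_dict(torch.load("checkpoints/run1/optimizer.pt"))
```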
3 votes · 2 answers

Huggingface Transformer - GPT2 resume training from saved checkpoint

Resuming GPT-2 fine-tuning, implemented from run_clm.py. Does Hugging Face's GPT-2 have a parameter to resume training from a saved checkpoint, instead of training again from the beginning? Suppose the Python notebook crashes while training; the…
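Hugging Face's Trainer, which run_clm.py is built on, supports this through resume_from_checkpoint. A sketch with a tiny stand-in dataset; paths, step counts, and the dataset itself are illustrative.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer, Trainer, TrainingArguments

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Tiny stand-in dataset; in run_clm.py this would be the tokenized, grouped corpus.
examples = [tokenizer("hello world")["input_ids"]] * 8
train_dataset = [{"input_ids": ids, "labels": ids} for ids in examples]

args = TrainingArguments(output_dir="gpt2-finetuned",
                         per_device_train_batch_size=2,
                         save_steps=2,          # writes checkpoint-2, checkpoint-4, ... to output_dir
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# The run that may crash trains normally:
trainer.train()

# In a later session, pass resume_from_checkpoint to continue instead of restarting:
# trainer.train(resume_from_checkpoint=True)                           # newest checkpoint in output_dir
# trainer.train(resume_from_checkpoint="gpt2-finetuned/checkpoint-2")  # or a specific one
```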
3 votes · 1 answer

What memory does a decoder-only Transformer use?

I've been reading a lot about transformers and self-attention, and have seen that BERT and GPT-2 are newer variants that use only an encoder (BERT) or only a decoder (GPT-2). I've been trying to build a decoder-only model for…
bellerb
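The only "memory" a decoder-only model attends to is its own earlier tokens in the same sequence; there is no encoder output to cross-attend to. That is enforced by a causal (lower-triangular) mask, sketched here assuming PyTorch.

```python
import torch

seq_len = 5
# Lower-triangular matrix: position i may attend to positions 0..i (its own past),
# never to future positions. This is the only context a decoder-only model uses;
# there is no separate encoder memory.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
print(causal_mask.int())
# tensor([[1, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0],
#         [1, 1, 1, 1, 1]])
```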
3 votes · 1 answer

How many characters can be input into the "prompt" for GPT-2?

I'm using the OpenAI GPT-2 model from GitHub. I think that the top_k parameter dictates how many tokens are sampled. Is this also the parameter that dictates how large a prompt can be given? If top_k = 40, how large can the prompt be?
Hanley Soilsmith
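top_k only limits how many candidate tokens are sampled from at each step; the prompt size is bounded instead by GPT-2's 1024-token context window (BPE tokens, not characters, so the character limit varies with the text). A sketch that counts prompt tokens, assuming the Hugging Face tokenizer rather than the original repository's encoder.

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

prompt = "Some long prompt text... " * 50
token_ids = tokenizer(prompt)["input_ids"]

# GPT-2's context window is 1024 BPE tokens; prompt + generated tokens must fit in it.
context_window = tokenizer.model_max_length  # 1024 for gpt2
print(len(token_ids), "prompt tokens, limit", context_window)
if len(token_ids) > context_window:
    token_ids = token_ids[-context_window:]  # e.g. keep only the most recent tokens
```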
3 votes · 2 answers

TensorFlow has no attribute "sort" in the GPT-2 Git release?

I downloaded the git repo (https://github.com/openai/gpt-2) and followed the python3 instructions (in DEVELOPERS.MD) for installation on my Kubuntu 18.04LTS box, but I cannot run it and instead get an error. Here is what I've done so far: pip3…
Sarah Szabo
3 votes · 1 answer

Is there a GPT-2 implementation that allows me to fine-tune and prompt for text completion?

I wish to fine-tune a GPT-2 implementation on some text data. I then want to use this model to complete a text prompt. I can do the first part easily enough using Max Woolf's gpt-2-simple implementation. And Neil Shepherd's fork of OpenAI allows…
Lodore66
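Max Woolf's gpt-2-simple covers both halves. A sketch of fine-tuning on a plain-text file and then prompting with prefix; the dataset path and step count are placeholders.

```python
import gpt_2_simple as gpt2

model_name = "124M"
gpt2.download_gpt2(model_name=model_name)      # fetch the pretrained checkpoint once

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="my_corpus.txt",         # plain-text training file (example path)
              model_name=model_name,
              steps=1000)                      # number of fine-tuning steps

# prefix conditions the completion on a text prompt.
text = gpt2.generate(sess,
                     prefix="Once upon a time",
                     length=200,
                     return_as_list=True)[0]
print(text)
```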
3 votes · 5 answers

Can't import the encoder code for fine-tuning GPT-2

I'm trying to reproduce the example from this article: https://medium.com/@ngwaifoong92/beginners-guide-to-retrain-gpt-2-117m-to-generate-custom-text-content-8bb5363d8b7f The example code is from the following repo:…
Luis Ramon Ramirez Rodriguez
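A hedged sketch of the usual import fix, assuming the repository used by the article was cloned to ./gpt-2 and that encoder.py lives in its src/ directory (both assumptions; adjust the path to the actual clone location).

```python
import os
import sys

# Assumption: the gpt-2 repository sits at ./gpt-2, with encoder.py in ./gpt-2/src.
# Putting that directory on sys.path lets `import encoder` resolve when running
# the article's fine-tuning snippets from outside the repo.
sys.path.insert(0, os.path.abspath(os.path.join("gpt-2", "src")))

import encoder  # noqa: E402  (the module from the repo, not an installed package)
```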
3 votes · 0 answers

Using GPT-2 with your own dictionary of words

I'm training GPT-2 with custom encodings and a custom vocab.bpe file. However, when I generate text using GPT-2, the output tokens fall outside the range of my new encodings. How can I make GPT-2 work for me, then?
zengod
3 votes · 2 answers

Fine-tune GPT-2 text prediction for conversational AI

I am experimenting with the gpt-2 model's conditional text generation to tweak it for a good chatbot. I am using nsheppard's code for retraining it on my custom dataset. I trained my model on a custom dataset of conversations that I pulled from my…
Bhavesh Laddagiri
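One reasonable way to prepare conversational data for a plain-text fine-tuning script (an assumption, not the question's actual preprocessing) is to write alternating speaker lines and separate conversations with GPT-2's <|endoftext|> token.

```python
# Example conversations; in practice these would come from the user's chat export.
conversations = [
    [("Me", "Hey, are you free tonight?"), ("Friend", "Yeah, want to grab dinner?")],
    [("Me", "Did you finish the report?"), ("Friend", "Almost, sending it in an hour.")],
]

with open("chat_dataset.txt", "w", encoding="utf-8") as f:
    for convo in conversations:
        for speaker, line in convo:
            f.write(f"{speaker}: {line}\n")
        # GPT-2's end-of-text token marks the boundary between independent conversations,
        # so the model learns not to carry context across them.
        f.write("<|endoftext|>\n")
```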