Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
2 votes, 0 answers

Training Hugging Face's GPT2 from scratch: how to implement the causal mask?

I am trying to train Hugging Face's implementation of the GPT2 model from scratch (meaning I am using their architecture but not the pre-trained weights), but I noticed by looking into the code here…
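A note on the usual answer: in the Hugging Face implementation the causal mask is a buffer registered inside GPT2Attention, so training from scratch needs no extra mask; you only instantiate the model from a config instead of from_pretrained. A minimal sketch (the small config sizes are illustrative):

    import torch
    from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

    config = GPT2Config(n_layer=6, n_head=8, n_embd=512)  # illustrative sizes
    model = GPT2LMHeadModel(config)            # random init, no pretrained weights
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # reuse the stock vocab

    batch = tokenizer("hello world", return_tensors="pt")
    # labels=input_ids: the model shifts them internally and applies the
    # built-in causal mask when computing the language-modeling loss.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()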
2 votes, 2 answers

GPT-2: continue training from a checkpoint

I am trying to continue training from a saved checkpoint using the Colab setup for GPT-2-simple at: https://colab.research.google.com/drive/1SvQne5O_7hSdmPvUXl5UzPeG5A6csvRA#scrollTo=aeXshJM-Cuaf but I just can't get it to work. Loading the saved…
Tessmus • 149 • 2 • 9
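For reference, gpt-2-simple resumes a saved run via the restore_from argument of finetune; a minimal sketch, assuming the checkpoint sits under checkpoint/run1 and a hypothetical corpus.txt:

    import gpt_2_simple as gpt2

    sess = gpt2.start_tf_sess()
    # restore_from="latest" resumes from checkpoint/run1;
    # "fresh" would restart from the base pretrained weights.
    gpt2.finetune(sess,
                  dataset="corpus.txt",   # hypothetical training file
                  model_name="124M",
                  run_name="run1",
                  restore_from="latest",
                  steps=1000,
                  overwrite=True)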
2 votes, 1 answer

TensorFlow not fully utilizing the GPU in a GPT-2 program

I am running the GPT-2 code for the large model (774M). It is used to generate text samples through interactive_conditional_samples.py, link: here. So I've given an input file containing prompts which are automatically selected to generate…
amateur • 21 • 2
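Two quick checks that usually come up in answers to this one, assuming the TF 1.x stack the original GPT-2 repo targets; note that interactive sampling decodes one token at a time, so low average GPU utilization is often expected rather than a bug:

    import tensorflow as tf

    # Confirm TF 1.x actually sees the GPU.
    print(tf.test.is_gpu_available())

    # Grow GPU memory on demand instead of reserving it all up front,
    # which makes nvidia-smi utilization numbers easier to read.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        pass  # build and run the sampling graph here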
2 votes, 0 answers

Can the HuggingFace GPT2DoubleHeadsModel be used for non-multiple-choice next-token prediction?

According to the Hugging Face Transformers documentation (https://huggingface.co/transformers/model_doc/gpt2.html#gpt2doubleheadsmodel), GPT2DoubleHeadsModel (NOT GPT2LMHeadModel but GPT2DoubleHeadsModel) is the GPT-2 transformer model with a language…
chico0913 • 577 • 4 • 10 • 22
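The short answer is usually yes: GPT2DoubleHeadsModel keeps the ordinary LM head alongside the multiple-choice head, so its logits output can be used for next-token prediction and mc_logits simply ignored. A sketch:

    import torch
    from transformers import GPT2DoubleHeadsModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2DoubleHeadsModel.from_pretrained("gpt2")

    enc = tokenizer("The capital of France is", return_tensors="pt")
    out = model(**enc)
    # out.logits comes from the language-modeling head; out.mc_logits
    # (the multiple-choice head) is irrelevant for next-token prediction.
    next_id = out.logits[0, -1].argmax()
    print(tokenizer.decode(next_id))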
1 vote, 0 answers

Invalid key: 409862 is out of bounds for size 0

How can I fix this? I wrote code to train GPT-2 on a dataset with Hugging Face, but I get an error and don't know why: IndexError …
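A common cause of this exact message: Trainer drops dataset columns whose names don't match the model's forward() signature, and if nothing matches, the dataset ends up with length 0. A hedged sketch of the usual fix:

    from transformers import TrainingArguments

    # "Invalid key: N is out of bounds for size 0" often means every column
    # was silently removed; either keep custom columns...
    args = TrainingArguments(output_dir="out", remove_unused_columns=False)
    # ...or tokenize so the dataset exposes input_ids, attention_mask and
    # labels, the names the model's forward() actually accepts.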
1 vote, 0 answers

GPT-2 fine-tuned LLM not generating the expected answer

I am fine-tuning a GPT-2 model to answer questions from a given faq.json. There is some issue with the answer generated by the code below. I assume I have not encoded/decoded the questions and answers correctly. Code - import torch from…
tagg • 383 • 4 • 7
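A minimal round-trip that avoids the most common encode/decode mistake here (decoding the echoed prompt along with the answer); the FAQ prompt format and model path are illustrative:

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")  # or your fine-tuned dir

    prompt = "Question: How do I reset my password?\nAnswer:"  # hypothetical format
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=40,
                         pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens, not the prompt echo.
    print(tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True))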
1 vote, 0 answers

Expected input batch_size (28) to match target batch_size (456); changing the batch size changes the target batch size with the GPT2 model

I was practising fine-tuning a GPT-2 model on a simple question-answer dataset when I encountered this error. I have studied other answers, but my input dataset shapes look fine. def tokenize_data(total_marks, coding_feeddback): inputs =…
Irfan Yaqub • 402 • 4 • 14
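For causal-LM fine-tuning the labels tensor must have exactly the same shape as input_ids; a mismatch like 28 vs. 456 usually means the labels were built from a different (unpadded or flattened) sequence. A sketch of shape-safe label construction:

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    enc = tokenizer(["question and answer text"], padding="max_length",
                    max_length=64, truncation=True, return_tensors="pt")
    labels = enc.input_ids.clone()
    labels[enc.attention_mask == 0] = -100   # keep padding out of the loss
    assert labels.shape == enc.input_ids.shape  # must always hold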
1 vote, 2 answers

Disable layers in GPT-2 model

I'm currently using a GPT-2 model that was trained on German texts. I would like to generate the next word in a text given a context chunk, but instead of using the whole model to predict the next word, I want each of the 12 layers to predict the…
Merle • 125 • 1 • 14
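One standard way to get a per-layer prediction without disabling anything is the "logit lens": request all hidden states and push each one through the final LayerNorm and LM head. A sketch against the stock English gpt2 (swap in the German checkpoint):

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # use the German model here
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    ids = tokenizer("Der Himmel ist", return_tensors="pt").input_ids
    # hidden_states holds 13 tensors: the embeddings plus one per layer.
    hs = model(ids, output_hidden_states=True).hidden_states
    for i, h in enumerate(hs[1:], start=1):
        logits = model.lm_head(model.transformer.ln_f(h))
        print(i, tokenizer.decode(logits[0, -1].argmax()))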
1 vote, 0 answers

In which form should the dataset be for an NLP model?

I am trying to fine-tune the tinkoff-ai/ruDialoGPT-medium model. In which form should my dataset be? The base generation format is: @@ПЕРВЫЙ@@ привет @@ВТОРОЙ@@ привет @@ПЕРВЫЙ@@ как дела? @@ВТОРОЙ@@ where @@ПЕРВЫЙ@@ ("first") marks the first speaker and the sample turns read "hi" / "hi" / "how are you?",…
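A small helper for flattening dialogues into that alternating-speaker format; the function name is made up, and the @@ПЕРВЫЙ@@/@@ВТОРОЙ@@ ("first"/"second") markers follow the model card:

    def to_ru_dialogpt(turns):
        """Flatten a list of utterances into ruDialoGPT's training format."""
        speakers = ["@@ПЕРВЫЙ@@", "@@ВТОРОЙ@@"]
        return " ".join(f"{speakers[i % 2]} {t}" for i, t in enumerate(turns))

    print(to_ru_dialogpt(["привет", "привет", "как дела?"]))
    # -> @@ПЕРВЫЙ@@ привет @@ВТОРОЙ@@ привет @@ПЕРВЫЙ@@ как дела?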
1 vote, 0 answers

Questions about padding masks in GPT

The GPT series models use the decoder of the Transformer, with unidirectional attention. In the Hugging Face source code for GPT, masked attention is implemented as: self.register_buffer( "bias", …
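The registered "bias" buffer only enforces causality; padding is handled separately by the attention_mask you pass in, and for batched generation the usual convention is to pad on the left. A sketch:

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.padding_side = "left"   # pad left so generation starts at real text
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    batch = tokenizer(["hello", "a much longer prompt"], padding=True,
                      return_tensors="pt")
    # attention_mask zeroes out the pad positions; the causal buffer
    # independently blocks attention to future positions.
    out = model.generate(**batch, max_new_tokens=5,
                         pad_token_id=tokenizer.eos_token_id)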
1 vote, 1 answer

Recovering input IDs from input embeddings using GPT-2

Suppose I have the following text: aim = 'Hello world! you are a wonderful place to be in.' I want to use GPT-2 to produce the input_ids, then produce the embeddings, and from the embeddings recover the input_ids. To do this I do: from transformers…
Wiliam • 1,078 • 10 • 21
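For raw token embeddings (before position embeddings and the transformer blocks) the inversion is exact: nearest neighbour against the wte matrix recovers every id. A sketch:

    import torch
    from transformers import GPT2Model, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2")

    aim = 'Hello world! you are a wonderful place to be in.'
    ids = tokenizer(aim, return_tensors="pt").input_ids
    emb = model.wte(ids)                    # token embeddings only

    # Nearest neighbour over the vocabulary; exact for raw embeddings,
    # but not for contextual hidden states from deeper layers.
    recovered = torch.cdist(emb[0], model.wte.weight).argmin(dim=-1)
    assert torch.equal(recovered, ids[0])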
1 vote, 0 answers

Template for RLHF with the TRL library

I'm trying to implement a very, very basic working template for RLHF with TRL. The notebook is here: https://www.kaggle.com/code/mcantoni81/rlhf-with-trl-gpt2 My goal here is to make GPT-2 answer "i'm the mailman", but maybe I'm not getting it right…
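For orientation only, a skeleton of the classic PPO loop in TRL with a hand-written reward; argument names move between TRL releases, so treat this as a sketch rather than the library's current API:

    import torch
    from transformers import AutoTokenizer
    from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

    config = PPOConfig(model_name="gpt2", batch_size=1, mini_batch_size=1)
    model = AutoModelForCausalLMWithValueHead.from_pretrained(config.model_name)
    tokenizer = AutoTokenizer.from_pretrained(config.model_name)
    tokenizer.pad_token = tokenizer.eos_token
    ppo = PPOTrainer(config, model, tokenizer=tokenizer)

    query = tokenizer("who are you?", return_tensors="pt").input_ids[0]
    full = ppo.generate(query, max_new_tokens=8,
                        pad_token_id=tokenizer.eos_token_id)[0]
    response = full[len(query):]            # strip the echoed prompt
    # Hand-crafted reward: positive only when the target phrase shows up.
    reward = torch.tensor(1.0 if "mailman" in tokenizer.decode(response) else -1.0)
    ppo.step([query], [response], [reward])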
1 vote, 1 answer

When using OPT-2.7B or any other natural language model, is there a way to trick it into having a conversation / give it a pre-prompt in the code?

Using this code, or a variant of it, is there anything that can be added to "trick" OPT into conversing as another user in a style more similar to a chatbot? As of now it will either start something more similar to an article or have a conversation…
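The usual trick is purely prompt-side: prepend a short scripted dialogue so the continuation lands in a chatbot register, then cut generation at the next "User:" line. A sketch (model id as on the Hub):

    from transformers import pipeline

    generator = pipeline("text-generation", model="facebook/opt-2.7b")

    # A few-shot "pre-prompt" framing the task as an ongoing conversation.
    preprompt = ("The following is a friendly conversation.\n"
                 "User: Hi, how are you?\n"
                 "Bot: I'm doing great, thanks for asking!\n"
                 "User: What's your favourite hobby?\n"
                 "Bot:")
    out = generator(preprompt, max_new_tokens=30, do_sample=True)
    print(out[0]["generated_text"])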
1 vote, 2 answers

How to fine-tune a GPT-2 model?

I'm using the Hugging Face Transformers package to load a pretrained GPT-2 model. I want to use GPT-2 for text generation, but the pretrained version isn't enough, so I want to fine-tune it with a bunch of personal text data. I'm not sure how I should…
ParmuTownley • 957 • 2 • 14 • 34
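The common pattern for this is the Trainer with a causal-LM collator; a compact sketch, assuming a hypothetical my_texts.txt with one passage per line:

    from datasets import load_dataset
    from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                              GPT2Tokenizer, Trainer, TrainingArguments)

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    ds = load_dataset("text", data_files={"train": "my_texts.txt"})
    ds = ds.map(lambda x: tokenizer(x["text"], truncation=True, max_length=128),
                batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments("gpt2-finetuned", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=ds["train"],
        # mlm=False makes the collator copy input_ids into labels (causal LM).
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()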
1 vote, 0 answers

NaN values appear when adding a new padding token to my tokenizer

I'm trying to fine-tune a DialoGPT model on a new dataset. I already processed my data correctly, and adding a new padding token to the tokenizer didn't seem to cause any issue: #my dataset :…
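The classic culprit behind NaNs after adding a pad token is a missing embedding resize: the new token id indexes past the old embedding matrix. A sketch of the fix:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
    model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

    tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    # Without this, the new pad id points outside the embedding table,
    # a classic source of NaN losses.
    model.resize_token_embeddings(len(tokenizer))
    # Also set labels to -100 at pad positions so they don't enter the loss.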