Highest Voted 'gpt-2' Questions

0

votes

0 answers

Why does generating text with gpt2 keep increasing memory consumption?

I have a Python script running an infinite loop, calling gpt2.generate, running on CPU (not GPU). After the model is loaded and the first spike of memory usage is over, the RAM consumption keeps increasing by about 100 MB every minute. There is…

python tensorflow gpt-2

asked Sep 19 '22 at 21:26

Bite code

578,959
113
301
329

0

votes

1 answer

GPT2 paper clarification

In the GPT-2 paper, under Section 2, Page 3, it says: Since the supervised objective is the same as the unsupervised objective but only evaluated on a subset of the sequence, the global minimum of the unsupervised objective is also the global…

gpt-2

asked Jun 10 '22 at 22:16

Albin

36
3
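The sentence quoted in the question above can be written out as a short sketch. The notation below (a sequence s_1, ..., s_n, a model p_θ, and an index subset T) is assumed for illustration and is not copied from the paper.

```latex
% Assumed notation: s_1..s_n is a token sequence, T is the subset of
% positions on which the supervised objective is evaluated.
\mathcal{L}_{\text{unsup}}(\theta) = -\sum_{i=1}^{n} \log p_\theta(s_i \mid s_1, \dots, s_{i-1}),
\qquad
\mathcal{L}_{\text{sup}}(\theta) = -\sum_{i \in T} \log p_\theta(s_i \mid s_1, \dots, s_{i-1}),
\quad T \subseteq \{1, \dots, n\}.
```

Under this reading, every term of the supervised sum also appears in the unsupervised sum. In expectation over the data, each term is minimized when p_θ matches the true conditional distribution, so a model that attains that minimum at every position minimizes both sums at once.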

0

votes

1 answer

OOM while fine-tuning medium-sized model with DialoGPT on Colab

I am trying to fine-tune DialoGPT with a medium-sized model, but I get a CUDA error during the training phase. I reduced the batch size from 4, but the error persists. My parameters are #self.output_dir = 'output-small' …

google-colaboratory huggingface-transformers language-model gpt-2

asked Jun 01 '22 at 20:20

Sap BH

71
1
6

0

votes

0 answers

How can I respond to a CLI prompt in Kaggle?

I'm using Kaggle to generate poetry samples with GPT-2. My notebook uses datasets from Gwern's poetry generator and uses nshepperd's GPT-2 model. This all works fine with my notebook when generating unconditional samples. !python…

python jupyter-notebook command-line-interface kaggle gpt-2

asked Apr 28 '22 at 16:58

theo

65
2
10

0

votes

0 answers

_forward_unimplemented() got an unexpected keyword argument 'input_ids'

I am training a model using the HuggingFace Trainer class (GPT2 text classification). The following code does a decent job: def preprocess_function(examples): return tokenizer(examples["text"], truncation=True ,max_length=MAXLEN, …

pytorch huggingface-transformers huggingface-tokenizers gpt-2

asked Mar 30 '22 at 19:25

eatalot foryou

1
2

0

votes

0 answers

Generating 10000 sentences from GptNeo Model results in out of memory error

I was doing some work where I wanted to generate 10000 sentences from the GptNeo Model. I have a GPU with 40 GB of memory and am running the model on the GPU, but every time the code runs out of memory. Is there a limitation to the number of sentences that I…

nlp huggingface-transformers gpt-2

asked Mar 23 '22 at 01:47

prb977

43
5
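One common way to bound memory when producing many samples is to generate them in fixed-size batches rather than in a single call. The sketch below only illustrates that pattern: `generate_batch` is a hypothetical stand-in for the real model call, nothing here is taken from the question, and whether batching resolves the asker's error is not known.

```python
from typing import Callable, Iterator, List


def generate_in_batches(
    total: int,
    batch_size: int,
    generate_batch: Callable[[int], List[str]],
) -> Iterator[str]:
    """Yield `total` texts, requesting at most `batch_size` per call.

    `generate_batch(n)` is a hypothetical callable that returns n
    generated strings; in practice it would wrap the model call.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    remaining = total
    while remaining > 0:
        n = min(batch_size, remaining)
        # Only one batch of outputs is held at a time.
        for text in generate_batch(n):
            yield text
        remaining -= n


if __name__ == "__main__":
    # Stand-in generator used purely for demonstration.
    def fake_generate(n: int) -> List[str]:
        return ["sample"] * n

    sentences = list(generate_in_batches(10, 4, fake_generate))
    print(len(sentences))
```

With a real model, the wrapped call would typically return decoded strings so that tensors from finished batches are no longer referenced.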

0

votes

1 answer

How to save checkpoints for the transformer gpt2 to continue training?

I am retraining the GPT2 language model, and am following this blog: https://towardsdatascience.com/train-gpt-2-in-your-own-language-fc6ad4d60171 Here, they have trained a network on GPT2, and I am trying to recreate the same. However, my dataset is…

tensorflow nlp gpt-2

asked Feb 22 '22 at 04:36

Vivek

124
14

0

votes

1 answer

GPT-2 pretrained model fails to load when TF v2 behaviour is disabled

I am trying to use GPT-2 in a codebase that is written for TensorFlow 1.x. However, I am running the code against TF 2.x installation binaries with the tf.disable_v2_behavior() flag. Without this tf.disable_v2_behavior() flag, GPT-2 pretrained model…

python tensorflow huggingface-transformers gpt-2

asked Feb 15 '22 at 16:31

Mohammad Rifat Arefin

399
3
10

0

votes

2 answers

"ValueError: You have to specify either input_ids or inputs_embeds" when training AutoModelWithLMHead Model (GPT-2)

I want to fine-tune the AutoModelWithLMHead model from this repository, which is a German GPT-2 model. I have followed the tutorials for pre-processing and fine-tuning. I have preprocessed a bunch of text passages for the fine-tuning, but when…

python pytorch huggingface-transformers gpt-2

asked Jan 04 '22 at 10:26

Stimmot

999
1
7
22

0

votes

0 answers

Fine-tune DialoGPT with a new dataset - loss below 1 and perplexity exploded

I am following the tutorial https://github.com/ncoop57/i-am-a-nerd/blob/master/_notebooks/2020-05-12-chatbot-part-1.ipynb on fine-tuning DialoGPT (GPT-2) with a new conversational dataset. It trained fine earlier; the perplexity was about 5, 6…

deep-learning nlp gpt-2

asked Dec 26 '21 at 02:52

vicmerbia

1
2
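For the DialoGPT question above, it may help to recall how loss and perplexity are usually related. The sketch below assumes perplexity is defined as the exponential of the mean per-token cross-entropy measured in nats; the function names are illustrative, and how the linked tutorial computes either quantity is not confirmed here.

```python
import math
from typing import Sequence


def mean_nll(token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood (in nats) of the observed tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity_from_loss(loss: float) -> float:
    """Perplexity under the assumed definition exp(mean cross-entropy)."""
    return math.exp(loss)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every observed token
    # is as uncertain as a uniform choice among four options.
    loss = mean_nll([0.25, 0.25, 0.25, 0.25])
    print(loss, perplexity_from_loss(loss))
```

Under that definition, a loss below 1 corresponds to a perplexity below e (about 2.72). A small loss reported alongside a very large perplexity therefore suggests the two numbers come from different data or different computations.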

0

votes

2 answers

What happens if optimal training loss is too high

I am training a Transformer. In many of my setups I obtain validation and training loss that look like this: Then, I understand that I should stop training at around epoch 1. But then the training loss is very high. Is this a problem? Does the…

pytorch huggingface-transformers transformer-model gpt-2 trainingloss

asked Dec 25 '21 at 20:22

katze

43
7
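The stopping rule described in the question above (stop where validation loss is lowest) can be sketched as picking the epoch with the minimum validation loss. The loss values in the example are made up for illustration and are not the asker's data.

```python
from typing import Sequence


def best_epoch(val_losses: Sequence[float]) -> int:
    """Return the 0-based index of the epoch with the lowest validation loss."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


if __name__ == "__main__":
    # Illustrative values only: validation loss bottoms out early
    # while training loss would still be comparatively high.
    val_losses = [2.0, 1.5, 1.7, 1.9]
    print(best_epoch(val_losses))
```

This selects a checkpoint by validation loss alone; the training loss at that epoch does not enter the choice.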

0

votes

1 answer

Trouble getting text from GPT2 returned?

Basically, I am trying to have gpt2 respond to a prompt in the variable {text}, and I am running into this error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). Here is my code thus far: import…

python huggingface-transformers gpt-2

asked Nov 25 '21 at 22:59

user13800925

0

votes

1 answer

implement do_sampling for custom GPT-NEO model

import numpy as np from transformers import GPTNeoForCausalLM, GPT2Tokenizer import coremltools as ct tokenizer = GPT2Tokenizer.from_pretrained("gpt2") sentence_fragment = "The Oceans are" class NEO(torch.nn.Module): def __init__(self,…

python nlp torch huggingface-transformers gpt-2

asked Nov 08 '21 at 20:11

Olexander Korenyuk

135
3
13

0

votes

0 answers

Cudnn won't work when I install cudnn64_8.dll

So I'm currently working with GPT2 running on Tensorflow for text generation. I'm working with this repo specifically. I recently decided to install CUDA and cudnn to improve GPU capability and installed it via these instructions. I'm currently…

python tensorflow cudnn gpt-2

asked Oct 17 '21 at 07:06

Alditrus

87
1
5

0

votes

1 answer

GPT 2 - TypeError: Cannot cast array data from dtype('O') to dtype('int64') according to the rule 'safe'

I am working with gpt2, Python 3.9 and TensorFlow 2.5, and when connecting to Flask (flask run in terminal) I get the following message: TypeError: Cannot cast array data from dtype('O') to dtype('int64') according to the rule 'safe' Here is the code…

python gpt-2

asked Sep 14 '21 at 18:08

DK26

103
8
