Questions tagged [gpt-2]

Use this tag with Generative Pre-trained Transformer 2 (GPT-2). Do not use with GPT-3 or the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
0 votes, 1 answer

BCELoss between logits and labels not working

I am using a GPT2 model that outputs logits (before softmax) in the shape (batch_size, num_input_ids, vocab_size) and I need to compare it with the labels that are of shape (batch_size, num_input_ids) to calculate BCELoss. How do I calculate…
MNK
  • 634
  • 4
  • 18
0 votes, 0 answers

How to work with JSON lines GPT-2 database?

I downloaded all the files, and all of them are just random answers in JSON format. I want to train my own TensorFlow.js model using this database, but I don't have a question database. What do I need to do? I want to train my model to…
0 votes, 1 answer

tokenizer.save_pretrained TypeError: Object of type property is not JSON serializable

I am trying to save the GPT2 tokenizer as follows: from transformers import GPT2Tokenizer, GPT2LMHeadModel tokenizer = GPT2Tokenizer.from_pretrained("gpt2") tokenizer.pad_token = GPT2Tokenizer.eos_token dataset_file = "x.csv" df =…
AKMalkadi
  • 782
  • 1
  • 5
  • 18
0 votes, 0 answers

fine tuning GPT2 on Colab gives error: Your session crashed after using all available RAM

I'm new to ML and am trying to create a model by fine-tuning GPT2. I got the dataset and preprocessed it (file_name). But when I try to run the code below to fine-tune GPT2, Colab always says 'Your session crashed after using all available…
Seungjun
  • 874
  • 9
  • 21
0 votes, 1 answer

Why does GPT-2 vocab contain weird words?

I was looking at the vocabulary of GPT-2: https://huggingface.co/gpt2/blob/main/vocab.json. To my surprise, I found very weird tokens that I did not expect. For example, it contains the token (index…
Daniel
  • 2,331
  • 3
  • 26
  • 35
0 votes, 2 answers

No space left on device error when trying to load GPT2 model

I am trying to run an experiment with GPT2; i.e., I use model = GPT2Model.from_pretrained('gpt2-xl') The error I get is a traceback which leads to OSError: [Errno 28] No space left on device:…
nlp4892
  • 61
  • 7
0 votes, 0 answers

TFGPT2LMHeadModel to TFLite changes the input and output shape

Converting TFGPT2LMHeadModel to TFLite produces unexpected input and output shapes, as opposed to the pretrained model gpt2-64.tflite. How can we fix this? !wget https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-64.tflite import numpy…
0 votes, 0 answers

ML.Net : How to define Shape Vectors for GPT2 ONNX model?

I'm trying to use ML.Net to consume an ONNX GPT-2 model: https://github.com/onnx/models/blob/main/text/machine_comprehension/gpt-2/README.md I'm stuck on defining the shape dictionary. The following are the input and output model properties,…
Murilo Maciel Curti
  • 2,677
  • 1
  • 21
  • 26
0 votes, 1 answer

How to replace the tokenize() and pad_sequence() functions from transformers?

I got the following imports: import torch, csv, transformers, random import torch.nn as nn from torch.utils.data import Dataset import torch.optim as optim import pandas as pd from transformers import GPT2Tokenizer, GPT2LMHeadModel, tokenize,…
0 votes, 0 answers

How to train GPT2 with Tensorflow

I'm trying to train a GPT-2 model with a custom dataset, but it fails with the error below. ValueError: Unexpected result of `train_function` (Empty logs). Please use `Model.compile(..., run_eagerly=True)`, or `tf.config.run_functions_eagerly(True)` for…
Ash N.
  • 11
  • 1
  • 2
0 votes, 0 answers

How to fine-tune GPT-2 in Hugging Face's PyTorch transformers library

I want to fine-tune GPT-2 by following this link: https://www.modeldifferently.com/en/2021/12/generaci%C3%B3n-de-fake-news-con-gpt-2/ It works correctly in Google Colab, but when I run it on my lab GPU, I encounter the following error: x =…
0 votes, 1 answer

How to answer subjective/descriptive types of questions using a BERT model?

I am trying to implement a BERT model for question answering tasks, but it's a little different from the existing Q&A models. The model will be given some text (3-4 pages) and will be asked questions based on the text, and the expected answer may be…
0 votes, 0 answers

TrOCR fine-tuning with Text generator model like gpt-2 or Bert

I want to fine-tune the TrOCR transformer model (https://github.com/microsoft/unilm/tree/master/trocr) with a different decoder such as BERT or GPT-2. The dataset that I have consists of (image, text) pairs; see the text in the following format (inside data…
Mohammed
  • 346
  • 1
  • 12
0 votes, 0 answers

ValueError while trying to finetune a gpt-2 model

While trying to fine-tune GPT-2, I always get a "ValueError" and can't seem to find its cause. I am using Max Woolf's Google Colab notebook to fine-tune GPT-2. I followed all the instructions in the notebook and tried to run this cell: sess…
Aliniu
  • 1
0 votes, 0 answers

Structure evaluation set GPT-2 text generation huggingface

I'm currently reproducing the second task (generating articles from headlines) of this tutorial: https://www.modeldifferently.com/en/2021/12/generaci%C3%B3n-de-fake-news-con-gpt-2/#42-fine-tuning-to-generate-articles-from-headlines I understand that…