Questions tagged [gpt-2]

Use this tag for questions about Generative Pre-trained Transformer 2 (GPT-2). Do not use it for GPT-3 or for the ad tagging library (GPT).

References

See the GPT-2 definition on Wikipedia.

199 questions
0 votes, 1 answer

Why is my Streamlit app not correctly summarizing my MP3 transcription?

I'm working on a Streamlit app that processes MP3 files. The main steps are: uploading an MP3 file; splitting the audio into smaller chunks using pydub; transcribing these chunks using OpenAI; summarizing the transcriptions using transformers…
0 votes, 1 answer

How do I fix the GPT2 tokenizer error in LangChain map_reduce (Llama 2)?

I'm using the AWS SageMaker JumpStart model for Llama 2 13B: meta-textgeneration-llama-2-13b-f. When I run a LangChain summarize chain with chain_type="map_reduce", I get the error below. I do not have access to https://huggingface.co from my environment.…
apprunner2186
0 votes, 0 answers

ValueError: Expected input batch_size (1052) to match target batch_size (508) when fine-tuning a GPT-2 model

Hello, I'm attempting to train a GPT-2 model to summarize passages without compromising their emotional impact. Consider summarizing a chapter from a book in such a way that the reader experiences the same emotions as with the chapter itself. I…
Damika
0 votes, 0 answers

'utf-8' codec can't decode byte 0xc3 error when using TextVectorization from tensorflow.keras.layers

I am trying to follow the steps given in a blog post (https://stackabuse.com/gpt-style-text-generation-in-python-with-tensorflowkeras/) but I get the error in the block below: vectorize_layer.adapt(text_list) vocab =…
Vaibhav
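
For context on the error in the title: 0xc3 is a valid lead byte of a two-byte UTF-8 sequence, so this message usually means the byte that follows is not a valid continuation byte (typical of text saved in a single-byte encoding such as Latin-1), or that a sequence was cut off. The following is a minimal sketch, not the asker's code. `decode_with_fallback` is a hypothetical helper, and the Latin-1 fallback is an assumption about the input, not a confirmed diagnosis.

```python
def decode_with_fallback(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes as UTF-8; on failure, retry with a single-byte encoding.

    Assumption: the fallback encoding is a guess. Latin-1 never raises,
    but it yields wrong characters if the real encoding is different.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# 0xc3 followed by an ASCII letter is not valid UTF-8,
# so this input takes the fallback path.
latin1_bytes = b"S\xc3O PAULO"
text = decode_with_fallback(latin1_bytes)

# Well-formed UTF-8 (0xc3 0xa9 encodes U+00E9) is decoded on the first try.
utf8_text = decode_with_fallback(b"\xc3\xa9tude")
```

Decoding the raw bytes explicitly before handing strings to a text pipeline makes the encoding assumption visible instead of leaving it to a default.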
0 votes, 0 answers

GPT-2 PyTorch Custom Training

I have this implementation of OpenAI GPT-2 LLM in PyTorch. Can you please help me write a training loop for it, if my dataset looks like this text file: <|user|>How are you?<|bot|>I’m fine<|endoftext|> <|user|>What films do you like?<|bot|>I like…
Yuki Arimo
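
As a first step toward such a training loop, the text format shown in the excerpt can be split into prompt/response pairs. This is a minimal sketch under assumptions: the marker strings are copied from the excerpt, `parse_dialogue` is a hypothetical helper, the sample line is modeled on the excerpt, and tokenization and the training loop itself are not shown.

```python
import re
from typing import List, Tuple

# Marker strings as they appear in the excerpt; the rest of the file
# is assumed to follow the same user/bot/endoftext layout.
PAIR_PATTERN = re.compile(
    r"<\|user\|>(.*?)<\|bot\|>(.*?)<\|endoftext\|>", re.DOTALL
)


def parse_dialogue(text: str) -> List[Tuple[str, str]]:
    """Return (user, bot) pairs found in the raw training text."""
    return [
        (user.strip(), bot.strip())
        for user, bot in PAIR_PATTERN.findall(text)
    ]


sample = "<|user|>How are you?<|bot|>I'm fine<|endoftext|>"
pairs = parse_dialogue(sample)
print(pairs)
```

Keeping the pairs separate makes it possible to decide later whether the loss should cover the whole sequence or only the bot part.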
0 votes, 0 answers

tf.compat.v1.estimator.Estimator(): NameError: name 'model_fn' is not defined. Getting errors in add_argument as well. Not recognizing the paths mentioned

I am trying to create a pet LLM using GPT-2 following the instructions here: https://thomascherickal.medium.com/how-to-create-your-own-llm-model-2598615a039a The code gives a syntax error when calling tf.compat.v1.estimator.Estimator() with model_fn as…
0 votes, 0 answers

Generating Sentences with TRL while Maintaining Sentiment - Issue with "AutoModelForCausalLMWithValueHead"

I am currently working on generating sentences with TRL (Transformer Reinforcement Learning) while preserving the same sentiment as the sample sentences. However, I've come across an issue with the TRL code that uses…
0 votes, 1 answer

Hugging Face Inference API returning short generated text with GPT-2 model

I'm using the Hugging Face API with a GPT-2 model to generate text based on a prompt. However, I'm encountering an issue where the generated text is consistently too short, even though I'm specifying a maximum number of new tokens and using other…
0 votes, 1 answer

How to generate text using a GPT-2 model with Hugging Face transformers?

I wanted to use GPT2Tokenizer and AutoModelForCausalLM for generating (rewriting) sample text. I have tried transformers==4.10.0, transformers==4.30.2 and --upgrade git+https://github.com/huggingface/transformers.git; however, I get the error of…
0 votes, 0 answers

Non-meaningful response from fine-tuned GPT-2 model

I am experimenting with the abilities of GPT-2 for question answering, aiming to make a good task-based chatbot. I trained my model on the air_dialogue dataset from Hugging Face https://huggingface.co/datasets/air_dialogue. I used the code from…
Chukwujike
0 votes, 0 answers

Fine-tune GPT-2 on insurance domain data

I am new to LLMs and I am trying to fine-tune GPT-2 from Hugging Face on insurance domain data. I am not getting results from the training data; instead, I am getting different results. My training data is a Word document (the content is not like Question…
0 votes, 0 answers

How to train a GPT-2 model to learn from the training text I have given?

I'm trying to train and fine-tune my GPT-2 model with my own sample training document. I'm using code similar to this: https://www.kaggle.com/code/changyeop/how-to-fine-tune-gpt-2-for-beginners . But the generated text is not related to any text…
0 votes, 1 answer

Unsupervised fine-tuning on custom documents after supervised fine-tuning on a general question-answer dataset. Will it be useful for a GPT-2 model?

I know the formal way of training a GPT-2 model on custom documents is to first do semi-supervised fine-tuning on the text of the documents, followed by supervised fine-tuning on question-answer pairs from the same documents. But the sole purpose of…
0 votes, 0 answers

tf.compat.v1.estimator.Estimator(): NameError: name 'model_fn' is not defined

I am trying to create a pet LLM using GPT-2 following the instructions here: https://thomascherickal.medium.com/how-to-create-your-own-llm-model-2598615a039a The code gives a syntax error when calling tf.compat.v1.estimator.Estimator() with model_fn as…
sm535
0 votes, 0 answers

TypeError: argmax(): argument 'input' (position 1) must be Tensor, not numpy.ndarray

I am training a GPT-2 model using my curated dataset and getting the following error. Whenever I try to debug an issue, a new error appears. Can anyone help me fix my script so that it can run? The training process starts but later gets many…