Questions tagged [fine-tune]

156 questions
2 votes, 1 answer

How to refine a trained GPT-2 model?

I'm currently working on text generation with my own text. I have trained a GPT-2 model on my own text, but it gives random answers; only for some questions does it give relevant ones. Is there a way to fine-tune it further or can…
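
A minimal sketch of one way to continue training, assuming the first run was saved with save_pretrained to a hypothetical ./gpt2-finetuned directory and that train_dataset is your tokenized text:

```python
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

# Reload the previously fine-tuned weights (hypothetical path) instead of the
# base "gpt2" checkpoint, so training continues where the first run stopped.
model = GPT2LMHeadModel.from_pretrained("./gpt2-finetuned")
tokenizer = GPT2TokenizerFast.from_pretrained("./gpt2-finetuned")

args = TrainingArguments(
    output_dir="./gpt2-finetuned-v2",
    num_train_epochs=1,             # short extra pass; watch validation loss
    learning_rate=1e-5,             # lower than the first run to limit forgetting
    per_device_train_batch_size=2,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,    # your tokenized text dataset
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```
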
2 votes, 0 answers

How to fine-tune the spacy-experimental "en-coreference-web-trf" model on my own custom domain dataset

I have a custom dataset of conversational data specific to the farming domain. The spacy-experimental coreference model (en-coreference-web-trf) performs okay at coreference resolution but does not reach the required accuracy, so I need to further…
2 votes, 1 answer

Looking for good ways to prepare a customized dataset for training ControlNet with Hugging Face diffusers

I want to train ControlNet myself, but I find it inconvenient to prepare the datasets. Following the Hugging Face tutorial available at this link: https://huggingface.co/blog/train-your-controlnet, I believe I should organize the dataset in…
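
The training script in that tutorial reads a dataset with image, conditioning_image, and text columns (the script's default column names). A sketch of assembling one with the datasets library, assuming a hypothetical layout of images/, conditions/, and a captions.json file mapping file names to captions:

```python
import json
from pathlib import Path
from datasets import Dataset, Image  # pip install datasets

# Hypothetical layout: images/ holds the targets, conditions/ the matching
# conditioning images (e.g. Canny edges), captions.json maps name -> caption.
root = Path("my_controlnet_data")
captions = json.loads((root / "captions.json").read_text())

records = {
    "image": [str(root / "images" / name) for name in captions],
    "conditioning_image": [str(root / "conditions" / name) for name in captions],
    "text": [captions[name] for name in captions],
}

# Casting the path columns to Image() makes datasets decode them lazily.
ds = (Dataset.from_dict(records)
      .cast_column("image", Image())
      .cast_column("conditioning_image", Image()))
ds.save_to_disk("controlnet_dataset")  # or ds.push_to_hub("user/my-dataset")
```
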
2 votes, 1 answer

Fine-tuning a pre-trained LLM for question-answering

Objective: My goal is to fine-tune a pre-trained LLM on a dataset about Manchester United's (MU's) 2021/22 season (they had a poor season). I want to be able to prompt the fine-tuned model with questions such as "How can MU improve?", or "What are…
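
One common recipe (an assumption, not the only approach) is to render each QA pair as a prompt/completion document and fine-tune a causal LM on those documents; the EOS token teaches the model where answers end. A sketch with a hypothetical pair:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in; any causal LM works

# Hypothetical QA pairs about the 2021/22 season.
pairs = [
    {"q": "How can MU improve?", "a": "Strengthen the midfield and defence."},
]

def to_example(pair):
    # One document per pair; the EOS token marks where an answer ends so the
    # fine-tuned model learns to stop instead of rambling on.
    text = f"Question: {pair['q']}\nAnswer: {pair['a']}{tokenizer.eos_token}"
    return tokenizer(text, truncation=True, max_length=512)

examples = [to_example(p) for p in pairs]
```
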
2 votes, 1 answer

How can I fine-tune mBART-50 for machine translation in the transformers Python library so that it learns a new word?

I am trying to fine-tune mBART-50 (paper, pre-trained model on Hugging Face) for machine translation in the transformers Python library. To test the fine-tuning, I am simply trying to teach mBART-50 a new word that I made up. I use the following code.…
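
For a made-up word, the usual first step is to register it as a new token and resize the embedding matrix; otherwise the tokenizer shatters it into subwords. A sketch, with a hypothetical word:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

name = "facebook/mbart-large-50-many-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(name)
tokenizer = MBart50TokenizerFast.from_pretrained(name)

# A made-up word gets split into subwords by default; registering it as a
# token gives it its own (randomly initialised) embedding that the
# fine-tuning step can then learn.
num_added = tokenizer.add_tokens(["blorfable"])  # hypothetical new word
if num_added:
    model.resize_token_embeddings(len(tokenizer))
```
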
2 votes, 1 answer

Validation loss shows 'no log' during model fine-tuning

I'm fine-tuning QA models from Hugging Face pretrained checkpoints using the Hugging Face Trainer, and during training the validation loss doesn't show. My compute_metrics function returns accuracy and F1 score, which don't show in the log as…
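
"No log" in the Trainer's progress table usually means evaluation or logging was never scheduled. A sketch of the relevant TrainingArguments (argument names as in the transformers versions of that era):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qa-finetune",
    evaluation_strategy="steps",  # run evaluation during training ...
    eval_steps=100,
    logging_strategy="steps",     # ... and actually log the losses
    logging_steps=100,
)
# Also pass eval_dataset=... and compute_metrics=... to Trainer(...);
# without an eval_dataset the validation columns stay at "no log".
```
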
2 votes, 1 answer

OpenAI multiclass classification logprobs don't return the defined classes; instead they return one class and variations of it

As stated in the title, the multiclass classification doesn't return the correct classes I defined in the training set; instead it returns the first class (the predicted class), and the other classes are just variations of it. Example request: curl…
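
The legacy OpenAI fine-tuning guidance for classification was to map each class to a single completion token, so logprobs come back as one clean token per class instead of multi-token variations. A sketch of preparing such a training file, with hypothetical classes and data:

```python
import json

# Map each class to a single token (numeric string labels with a leading
# space), so the model's top logprobs are one token per class rather than
# multi-token spellings and variations of a class name.
classes = ["billing", "technical", "sales"]  # hypothetical classes
label_for = {c: f" {i}" for i, c in enumerate(classes)}

with open("train.jsonl", "w") as f:
    for text, cls in [("My invoice is wrong", "billing")]:  # hypothetical row
        f.write(json.dumps({"prompt": text + "\n\n###\n\n",
                            "completion": label_for[cls]}) + "\n")
# At inference, request max_tokens=1 with logprobs to read a clean class
# distribution over the numeric labels.
```
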
2 votes, 2 answers

Fine-tune a davinci model to be similar to InstructGPT

I have a few-shot GPT-3 text-davinci-003 prompt that produces "pretty good" results, but I quickly run out of tokens per request for interesting use cases. I have a data set (n≈20) which I'd like to use to train the model further, but there is no way to…
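
A sketch of the legacy (pre-1.0 openai-python) fine-tune flow that applied to the base davinci model, assuming train.jsonl holds the ~20 prompt/completion pairs:

```python
import openai  # pre-1.0 openai-python, matching the davinci fine-tune era

openai.api_key = "sk-..."

# Upload the prompt/completion pairs, then start a legacy fine-tune job on
# base davinci (the instruct variants could not be fine-tuned at the time).
upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=upload["id"], model="davinci")
print(job["id"])  # poll this job id until the fine-tuned model is ready
```
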
2 votes, 2 answers

What are the differences between adapter tuning and prefix tuning?

I am trying to understand the concepts of adapter tuning, prompt tuning, and prefix tuning in the context of few-shot learning. It appears to me that I can apply prompt tuning to a black-box language model. I read that for prompt tuning the entire…
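
For contrast, a minimal bottleneck adapter in the style of Houlsby et al.: a small trainable residual MLP inserted inside each otherwise frozen transformer layer, so only the adapter weights are updated.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: project down, apply a nonlinearity, project back
    up, and add the result to the input via a residual connection."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))
```

Prefix tuning instead prepends trainable key/value vectors to each layer's attention while leaving all original weights untouched, and prompt tuning trains only soft tokens at the input layer, which is why it can be applied to a black-box model.
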
2 votes, 1 answer

Making predictions with the HuggingFace Trainer

I've been fine-tuning a model from HuggingFace via the Trainer class. I went through the training process via trainer.train() and also tested it with trainer.evaluate(). My question is how I can run the model on specific data. In the case of a…
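
Assuming the trainer object from the question and a tokenized Dataset of new examples, Trainer.predict runs the forward passes; a sketch for a classification head:

```python
import numpy as np

# test_dataset: new examples tokenized exactly like the training data.
predictions = trainer.predict(test_dataset)

# predictions.predictions holds the raw logits; argmax gives class ids
# for a classification model.
predicted_labels = np.argmax(predictions.predictions, axis=-1)
```
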
2 votes, 1 answer

GPT-J and GPT-Neo generate overly long sentences

I trained GPT-J and GPT-Neo models (fine-tuning) on my texts and am trying to generate new text, but very often the sentences are extremely long (sometimes 300 characters each), although the sentences in the dataset are of normal length (50-100…
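
Run-on generations usually mean the model never learned, or is never told, where to stop; capping the length and stopping on the end-of-text token helps. A sketch, using GPT-Neo as a stand-in checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

inputs = tokenizer("Prompt text", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=60,                    # hard cap on the generated length
    eos_token_id=tokenizer.eos_token_id,  # stop at end-of-text
    no_repeat_ngram_size=3,               # curbs run-on repetition
    do_sample=True,
    top_p=0.9,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If the training texts were concatenated without an end-of-text token between documents, the model has no stop signal to learn; inserting tokenizer.eos_token between training samples is the usual fix.
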
2 votes, 0 answers

Torchvision RetinaNet predicts the unwanted background class

I want to train the pretrained RetinaNet from torchvision on my custom dataset with 2 classes (without background). To train with RetinaNet, I made the following modifications: num_classes = 3 # num of objects to identify + background class model =…
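
For what it's worth, torchvision's RetinaNet scores each class with an independent sigmoid under focal loss, so unlike Faster R-CNN it has no dedicated background slot. A hedged sketch of swapping in a classification head for just the real classes:

```python
from torchvision.models.detection import retinanet_resnet50_fpn
from torchvision.models.detection.retinanet import RetinaNetClassificationHead

model = retinanet_resnet50_fpn(pretrained=True)

# Sigmoid-per-class focal loss needs no explicit background class, so pass
# only the number of real object classes here.
num_classes = 2
num_anchors = model.head.classification_head.num_anchors
in_channels = model.backbone.out_channels
model.head.classification_head = RetinaNetClassificationHead(
    in_channels, num_anchors, num_classes
)
```
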
2 votes, 0 answers

Finetuning Transformers in PyTorch (BERT, RoBERTa, etc.)

Alright, so there are multiple methods to fine-tune a transformer: (1) freeze the transformer's parameters and feed only its final outputs into another model (the user trains this other model); (2) the whole transformer, with a user-added custom layer, is…
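
A sketch of the first method (freeze the pretrained encoder, train only the task head); full fine-tuning is the same code minus the freezing loop:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)

# Method 1: freeze the pretrained encoder so only the freshly initialised
# classification head receives gradients.
for param in model.base_model.parameters():
    param.requires_grad = False

# Method 2 (full fine-tuning) simply skips the loop above, so every
# parameter, encoder included, is updated during training.
```
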
2 votes, 3 answers

RuntimeError: Found dtype Long but expected Float when fine-tuning using Trainer API

I'm trying to fine-tune a BERT model for sentiment analysis (classifying text as positive/negative) with the Huggingface Trainer API. My dataset has two columns, Text and Sentiment, and looks like this: Text Sentiment This was good…
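
That error usually means the model was built with num_labels=1, which makes the Trainer treat the task as regression (MSE loss expects float targets); declaring two labels switches to cross-entropy, which expects Long labels. A sketch, assuming BERT and integer 0/1 sentiment labels:

```python
from transformers import AutoModelForSequenceClassification

# With num_labels=2 the model computes cross-entropy over two classes,
# which accepts Long labels; leaving num_labels at the default of 1
# selects MSE regression and raises
# "Found dtype Long but expected Float" on integer labels.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
# Also make sure the label column is named "labels" and holds ints 0/1.
```
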
2 votes, 1 answer

Fine-tune mBART on pre-training tasks using HuggingFace

I would like to fine-tune facebook/mbart-large-cc25 on my data using pre-training tasks, in particular Masked Language Modeling (MLM). How can I do that in HuggingFace? Edit: rewrote the question for the sake of clarity
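
mBART was pre-trained with a denoising objective (span infilling plus sentence permutation) rather than BERT-style MLM, and the stock run_mlm.py script targets encoder-only models. A rough, simplified stand-in that corrupts single tokens in the encoder input and reconstructs the original sentence as the target:

```python
import torch
from transformers import MBartTokenizer, MBartForConditionalGeneration

tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

def mask_for_denoising(text, mask_prob=0.35):
    """Rough stand-in for mBART's text-infilling objective: corrupt tokens
    in the encoder input and use the original sentence as the decoder
    target (real pre-training masks contiguous spans, not single tokens)."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    labels = enc["input_ids"].clone()
    special = torch.tensor(tokenizer.get_special_tokens_mask(
        input_ids[0].tolist(), already_has_special_tokens=True)).bool()
    corrupt = (torch.rand(input_ids.shape) < mask_prob) & ~special
    input_ids[corrupt] = tokenizer.mask_token_id
    return {"input_ids": input_ids,
            "attention_mask": enc["attention_mask"],
            "labels": labels}

batch = mask_for_denoising("Ein Beispielsatz zum Vortrainieren.")
loss = model(**batch).loss  # feed batches like this to a Seq2SeqTrainer
```
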