How to use 'run_glue.py' in HuggingFace to finetune for classification?

Question

Here is all the "documentation" I could find: https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification

I honestly don't see how anyone is supposed to work out how to use this script from the resources available online, short of having written it themselves.

I want to fine-tune a RoBERTa model on a classification task with my own dataset (.json files) and my own checkpoint.

DATA_SET = 'books'
MODEL = 'RoBERTa_small_fr_huggingface'  # I have sentencepiece.bpe.model and pytorch_model.bin, what do I use?
MAX_SENTENCES = '8'  # batch size
LR = '1e-5'
MAX_EPOCH = '5'
NUM_CLASSES = '2'
SEEDS = 3
CUDA_VISIBLE_DEVICES = 0


TASK = 'sst2'  # ??
DATA_PATH = 'data/cls-books-json/'  # with test.json, train.json, valid.json

for SEED in range(SEEDS):
  SAVE_DIR= 'checkpoints/'+TASK+'/'+DATA_SET+'/'+MODEL+'_ms'+str(MAX_SENTENCES)+'_lr'+str(LR)+'_me'+str(MAX_EPOCH)+'/'+str(SEED)
  !(python3 libs/transformers/examples/pytorch/text-classification/run_glue.py \
       --model_name_or_path $MODEL \
       --task_name $TASK_NAME \
       --do_train \
       --do_eval \
       --output_dir /tmp/hf)

So far I get this error:

run_glue.py: error: argument --model_name_or_path: expected one argument

But I'm sure it's not the only problem.

How to use 'run_glue.py' in HuggingFace to finetune for classification?
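
For reference, argparse reports "expected one argument" when a flag is followed by nothing or by another flag, which is what would happen if `$MODEL` reached the shell as an empty string. A minimal sketch of one way around that is to build the argument list in Python so every flag is paired with an explicit value. It assumes the `--train_file` / `--validation_file` options of `run_glue.py` are used for local files instead of `--task_name`, that each JSON record has a `label` field plus one or two text fields, and that the model directory also holds a `config.json` and tokenizer files. The helper name `build_run_glue_command` is hypothetical and the defaults simply mirror the variables in the question.

```python
import os
import sys


def build_run_glue_command(model_dir, data_path, output_dir, seed,
                           batch_size=8, learning_rate=1e-5, num_epochs=5,
                           script="libs/transformers/examples/pytorch/"
                                  "text-classification/run_glue.py"):
    """Return the argv list for one fine-tuning run on local JSON files.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        sys.executable, script,
        "--model_name_or_path", model_dir,
        # Local files replace --task_name; the script expects both files
        # to share the same extension.
        "--train_file", os.path.join(data_path, "train.json"),
        "--validation_file", os.path.join(data_path, "valid.json"),
        "--do_train",
        "--do_eval",
        "--per_device_train_batch_size", str(batch_size),
        "--learning_rate", str(learning_rate),
        "--num_train_epochs", str(num_epochs),
        "--seed", str(seed),
        "--output_dir", output_dir,
    ]


# One command per seed, each with its own output directory.
commands = [
    build_run_glue_command(
        model_dir="RoBERTa_small_fr_huggingface",
        data_path="data/cls-books-json/",
        output_dir=os.path.join("checkpoints", "books", str(seed)),
        seed=seed,
    )
    for seed in range(3)
]

for cmd in commands:
    print(" ".join(cmd))
    # To launch the run for real (assumption: the script path exists):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell variable expansion entirely, so an undefined name fails in Python with a `NameError` instead of silently turning into a missing argument.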
