
I am downloading the model microsoft/Multilingual-MiniLM-L12-H384 from https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main and then fine-tuning it.
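For reference, a minimal sketch of loading that checkpoint directly by its Hub ID instead of from a local path (the Auto* classes resolve the correct tokenizer and model classes from the checkpoint's config; num_labels=2 matches the code below):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Downloads the checkpoint from the Hugging Face Hub (or reuses the local cache)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Multilingual-MiniLM-L12-H384")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/Multilingual-MiniLM-L12-H384", num_labels=2
)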

Transformers version: 4.11.3

I have written the below code:

import torch
import transformers as tr
import wandb

wandb.login()
%env WANDB_LOG_MODEL=true

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = tr.BertForSequenceClassification.from_pretrained("/home/pc/minilm_model", num_labels=2)
model.to(device)

print("hello")

training_args = tr.TrainingArguments(
    report_to='wandb',
    output_dir='/home/pc/proj/results2',    # output directory
    num_train_epochs=10,                    # total number of training epochs
    per_device_train_batch_size=16,         # batch size per device during training
    per_device_eval_batch_size=32,          # batch size per device during evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                      # number of warmup steps for the learning rate scheduler
    weight_decay=0.01,                      # strength of weight decay
    logging_dir='./logs',                   # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="no"
)

print("hello")

trainer = tr.Trainer(
    model=model,                   # the instantiated Transformers model to be trained
    args=training_args,            # training arguments, defined above
    train_dataset=train_data,      # training dataset
    eval_dataset=val_data,         # evaluation dataset
    compute_metrics=compute_metrics
)

trainer.train()

After executing this, training hangs at the following output and makes no further progress:

***** Running training *****
  Num examples = 12981
  Num Epochs = 20
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 32
  Gradient Accumulation steps = 1
  Total optimization steps = 8120
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"

What could be a possible solution?

MAC

1 Answer


I'm not sure why this would have stopped training.
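One quick sanity check: with logging_steps=1000 you won't see any loss output until step 1000, so a run can look "stuck" while it is actually training. A minimal smoke test you could try, reusing the names from your post (model, train_data) and assuming train_data is a datasets.Dataset:

import transformers as tr

# Hypothetical debug run: tiny subset, log every step, all integrations off
debug_args = tr.TrainingArguments(
    output_dir="/tmp/debug_run",    # hypothetical scratch directory
    num_train_epochs=1,
    per_device_train_batch_size=16,
    logging_steps=1,                # log after every optimization step
    report_to="none",               # take W&B (and other integrations) out of the picture
    save_strategy="no",
)

debug_trainer = tr.Trainer(
    model=model,
    args=debug_args,
    train_dataset=train_data.select(range(64)),  # small slice; adjust if train_data is not a datasets.Dataset
)
debug_trainer.train()

If steps tick by here, the original run was most likely progressing slowly rather than hanging.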

If you post to the HF forum, maybe someone there could help you: https://discuss.huggingface.co

I work for W&B, so if you think this is related to using W&B, or if you have any questions, I can help you here or on our forums: http://community.wandb.ai
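If you want to rule W&B in or out, the log line in your output already shows the switch: disable the integration before the Trainer is created and see whether training proceeds. A minimal sketch:

import os

# Disable the W&B integration entirely (per the hint in the training log),
# then rebuild training_args/trainer as before
os.environ["WANDB_DISABLED"] = "true"

If training runs fine with this set, the hang is in the W&B setup (for example, waiting on a login prompt or a network call) rather than in the Trainer itself.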

Scott Condron