
I tried implementing an Active Learning procedure in AllenNLP v2.0.1. However, with the current GradientDescentTrainer implementation, I am unable to continue training on a new batch of Instances.

The model (also trained with AllenNLP) has finished training for the predefined number of epochs on an initial training dataset. I restore the model using the Model.from_archive method, and a Trainer is instantiated for it using the Trainer.from_params static constructor.
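Roughly, that restore-and-rebuild step looks like this (the file names are placeholders, and data_loader is the loader described further below):

from allennlp.common import Params
from allennlp.models import Model
from allennlp.training import Trainer

# Restore the archived model produced by the initial training run
model = Model.from_archive("model.tar.gz")

# Rebuild a trainer from the "trainer" section of the original config
params = Params.from_file("config.jsonnet")
trainer = Trainer.from_params(
    params=params.pop("trainer"),
    model=model,
    serialization_dir="serialization_dir",  # same directory as the first run
    data_loader=data_loader,
)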

Thereafter, when I attempt to continue training on the new batch of Instances by calling trainer.train(), it skips training because of the following loop in the _try_train method:

for epoch in range(epoch_counter, self._num_epochs):

This happens because epoch_counter is restored to 5, carried over from the earlier training run on the initial dataset. Here is the relevant snippet:

def _try_train(self) -> Tuple[Dict[str, Any], int]:
    try:
        # epoch_counter is read back from the checkpoint in serialization_dir
        epoch_counter = self._restore_checkpoint()

self._num_epochs is also 5, which I assume comes from the num_epochs value in my .jsonnet training configuration file.

Put simply, my requirement is to load an AllenNLP model that has already been trained and to continue training it on a new batch of Instances (a single Instance, in fact, which I would load using a SimpleDataLoader).
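For illustration, with new_instance being the freshly labelled Instance and model the restored model:

from allennlp.data.data_loaders import SimpleDataLoader

# Wrap the single new instance in a loader and index it with the
# vocabulary of the restored model
data_loader = SimpleDataLoader([new_instance], batch_size=1)
data_loader.index_with(model.vocab)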

I have also attached the Trainer configuration below. The model I am using is a custom wrapper around BasicClassifier, solely for the purpose of logging additional metrics (a rough sketch of the wrapper follows the config).

Thanks in advance.

"trainer": {
  "num_epochs": 5,
  "patience": 1, // for early stopping
  "grad_norm": 5.0,
  "validation_metric": "+accuracy",
  "optimizer": {
    "type": "adam",
    "lr": 0.001
  },
  "callbacks": [
    {
      "type": "tensorboard"
    }
  ]
}
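For completeness, the wrapper is essentially just a registered subclass that adds to get_metrics; a rough sketch (the names here are made up):

from typing import Dict
from allennlp.models import Model
from allennlp.models.basic_classifier import BasicClassifier

@Model.register("logging_basic_classifier")
class LoggingBasicClassifier(BasicClassifier):
    """BasicClassifier plus a few extra metrics for logging."""

    def get_metrics(self, reset: bool = False) -> Dict[str, float]:
        metrics = super().get_metrics(reset)
        # ... extra metrics would be computed and added here ...
        return metrics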

1 Answer


A couple of pieces of advice:

  1. Don't turn the recover flag on when you run training from the command line or through a script.
  2. Specify a different serialization directory for your second round of training (see the sketch below).
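For example, in a script, point the second trainer at a fresh, empty directory, something along these lines (the directory name is just an example):

# With a fresh serialization directory, _restore_checkpoint() finds no
# checkpoint, so epoch_counter starts at 0 and the epoch loop actually runs
trainer = Trainer.from_params(
    params=params.pop("trainer"),
    model=model,
    serialization_dir="active_learning_round_1",  # new, empty directory
    data_loader=data_loader,
)
trainer.train()

From the command line, the equivalent is passing a new -s directory and leaving out the --recover flag.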