When training a PyTorch Lightning model in a Jupyter Notebook, the console log output is awkward:
Epoch 0: 100%|█████████▉| 2315/2318 [02:05<00:00, 18.41it/s, loss=1.69, v_num=26, acc=0.562]
Validating: 0it [00:00, ?it/s]
Validating: 0%| | 0/1 [00:00<?, ?it/s]
Epoch 0: 100%|██████████| 2318/2318 [02:09<00:00, 17.84it/s, loss=1.72, v_num=26, acc=0.500, val_loss=1.570, val_acc=0.564]
Epoch 1: 100%|█████████▉| 2315/2318 [02:04<00:00, 18.63it/s, loss=1.56, v_num=26, acc=0.594, val_loss=1.570, val_acc=0.564]
Validating: 0it [00:00, ?it/s]
Validating: 0%| | 0/1 [00:00<?, ?it/s]
Epoch 1: 100%|██████████| 2318/2318 [02:08<00:00, 18.07it/s, loss=1.59, v_num=26, acc=0.528, val_loss=1.490, val_acc=0.583]
Epoch 2: 100%|█████████▉| 2315/2318 [02:01<00:00, 19.02it/s, loss=1.53, v_num=26, acc=0.617, val_loss=1.490, val_acc=0.583]
Validating: 0it [00:00, ?it/s]
Validating: 0%| | 0/1 [00:00<?, ?it/s]
Epoch 2: 100%|██████████| 2318/2318 [02:05<00:00, 18.42it/s, loss=1.57, v_num=26, acc=0.500, val_loss=1.460, val_acc=0.589]
I would expect the "correct" output from the same training run to be:
Epoch 0: 100%|██████████| 2318/2318 [02:09<00:00, 17.84it/s, loss=1.72, v_num=26, acc=0.500, val_loss=1.570, val_acc=0.564]
Epoch 1: 100%|██████████| 2318/2318 [02:08<00:00, 18.07it/s, loss=1.59, v_num=26, acc=0.528, val_loss=1.490, val_acc=0.583]
Epoch 2: 100%|██████████| 2318/2318 [02:05<00:00, 18.42it/s, loss=1.57, v_num=26, acc=0.500, val_loss=1.460, val_acc=0.589]
Why are the epoch lines uselessly repeated and split in this manner? I'm also not sure what the Validating
lines are for, since they don't seem to provide any information.
The training and validation steps from the model are as follows:
import torch
import torchmetrics as tm  # assuming tm refers to torchmetrics

def training_step(self, train_batch, batch_idx):
    x, y = train_batch
    y_hat = self.forward(x)
    loss = torch.nn.NLLLoss()(torch.log(y_hat), y.argmax(dim=1))
    acc = tm.functional.accuracy(y_hat.argmax(dim=1), y.argmax(dim=1))
    self.log("acc", acc, prog_bar=True)
    return loss

def validation_step(self, valid_batch, batch_idx):
    x, y = valid_batch
    y_hat = self.forward(x)
    loss = torch.nn.NLLLoss()(torch.log(y_hat), y.argmax(dim=1))
    acc = tm.functional.accuracy(y_hat.argmax(dim=1), y.argmax(dim=1))
    self.log("val_loss", loss, prog_bar=True)
    self.log("val_acc", acc, prog_bar=True)
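
As an aside on the loss computation in these steps: since y_hat looks like softmax probabilities, NLLLoss over torch.log(y_hat) should be equivalent to cross-entropy over the underlying logits. A small sanity check of that equivalence (the logits and targets here are made up for illustration):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(4, 10)           # hypothetical raw scores for a batch of 4
y_hat = torch.softmax(logits, dim=1)  # probabilities, as the model appears to output
target = torch.randint(0, 10, (4,))   # class indices, as produced by y.argmax(dim=1)

# NLLLoss over log-probabilities, as in training_step/validation_step
loss_nll = torch.nn.NLLLoss()(torch.log(y_hat), target)
# Cross-entropy over the raw logits, which fuses log-softmax and NLL
loss_ce = torch.nn.CrossEntropyLoss()(logits, target)

print(torch.allclose(loss_nll, loss_ce, atol=1e-6))  # → True
```

CrossEntropyLoss on the logits is also more numerically stable than taking torch.log of softmax output explicitly, since it uses log-softmax internally.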