
I'm trying to find an appropriate learning rate for my neural network in PyTorch. I've set up torch.optim.lr_scheduler.CyclicLR to cycle the learning rate, but I can't figure out which learning rate I should actually select. The dataset is MNIST_TINY.

Code:

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-7, max_lr=0.1, step_size_up=5, mode='triangular')

lrs = []
for epoch in range(5):
    model.train()
    for data, label in train_dls:
        optimizer.zero_grad()
        target = model(data)
        train_step_loss = loss_fn(target, label)  # loss_fn is CrossEntropyLoss
        train_step_loss.backward()
        optimizer.step()
        # with :.4f the base_lr of 1e-7 in epoch 1 prints as 0.0000
        print(f'Epoch:{epoch+1} | Optim:{optimizer.param_groups[0]["lr"]:.4f} | Loss: {train_step_loss:.2f}')
        lrs.append(optimizer.param_groups[0]["lr"])
    scheduler.step()  # stepped once per epoch, so the LR only changes between epochs

Output:

Epoch:1 | Optim:0.0000 | Loss: 0.70
Epoch:1 | Optim:0.0000 | Loss: 0.70
Epoch:1 | Optim:0.0000 | Loss: 0.70
Epoch:1 | Optim:0.0000 | Loss: 0.70
Epoch:1 | Optim:0.0000 | Loss: 0.70
Epoch:1 | Optim:0.0000 | Loss: 0.69
Epoch:1 | Optim:0.0000 | Loss: 0.69
Epoch:1 | Optim:0.0000 | Loss: 0.68
Epoch:1 | Optim:0.0000 | Loss: 0.69
Epoch:1 | Optim:0.0000 | Loss: 0.69
Epoch:1 | Optim:0.0000 | Loss: 0.69
Epoch:1 | Optim:0.0000 | Loss: 0.72
Epoch:2 | Optim:0.0200 | Loss: 0.70
Epoch:2 | Optim:0.0200 | Loss: 0.70
Epoch:2 | Optim:0.0200 | Loss: 0.70
Epoch:2 | Optim:0.0200 | Loss: 0.70
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.69
Epoch:2 | Optim:0.0200 | Loss: 0.70
Epoch:2 | Optim:0.0200 | Loss: 0.68
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.68
Epoch:3 | Optim:0.0400 | Loss: 0.68
Epoch:3 | Optim:0.0400 | Loss: 0.69
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.68
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.69
Epoch:3 | Optim:0.0400 | Loss: 0.70
Epoch:3 | Optim:0.0400 | Loss: 0.65
Epoch:4 | Optim:0.0600 | Loss: 0.69
Epoch:4 | Optim:0.0600 | Loss: 0.68
Epoch:4 | Optim:0.0600 | Loss: 0.68
Epoch:4 | Optim:0.0600 | Loss: 0.73
Epoch:4 | Optim:0.0600 | Loss: 0.70
Epoch:4 | Optim:0.0600 | Loss: 0.71
Epoch:4 | Optim:0.0600 | Loss: 0.71
Epoch:4 | Optim:0.0600 | Loss: 0.68
Epoch:4 | Optim:0.0600 | Loss: 0.71
Epoch:4 | Optim:0.0600 | Loss: 0.69
Epoch:4 | Optim:0.0600 | Loss: 0.69
Epoch:4 | Optim:0.0600 | Loss: 0.72
Epoch:5 | Optim:0.0800 | Loss: 0.69
Epoch:5 | Optim:0.0800 | Loss: 0.69
Epoch:5 | Optim:0.0800 | Loss: 0.70
Epoch:5 | Optim:0.0800 | Loss: 0.69
Epoch:5 | Optim:0.0800 | Loss: 0.69
Epoch:5 | Optim:0.0800 | Loss: 0.68
Epoch:5 | Optim:0.0800 | Loss: 0.71
Epoch:5 | Optim:0.0800 | Loss: 0.68
Epoch:5 | Optim:0.0800 | Loss: 0.71
Epoch:5 | Optim:0.0800 | Loss: 0.69
Epoch:5 | Optim:0.0800 | Loss: 0.71
Epoch:5 | Optim:0.0800 | Loss: 0.70

In a nutshell: how do I find the correct learning rate? I would also appreciate it if anybody could show how to plot loss against learning rate.
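
For context, the kind of plot being asked about is usually produced by an "LR range test": sweep the learning rate upward over a number of batches, record the loss at each step, and plot loss against LR on a log scale; a common heuristic is to pick an LR roughly an order of magnitude below where the loss starts to blow up. A minimal sketch, reusing the question's model, train_dls and loss_fn and assuming matplotlib is available (this uses ExponentialLR for the sweep rather than CyclicLR):

import torch
import matplotlib.pyplot as plt
from itertools import cycle

optimizer = torch.optim.SGD(model.parameters(), lr=1e-7)
num_steps = 100
# gamma chosen so the LR grows from 1e-7 up to ~1.0 over num_steps batches
gamma = (1.0 / 1e-7) ** (1.0 / num_steps)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

lrs, losses = [], []
model.train()
# cycle() re-iterates the dataloader, since a tiny dataset has few batches
for _, (data, label) in zip(range(num_steps), cycle(train_dls)):
    optimizer.zero_grad()
    loss = loss_fn(model(data), label)
    loss.backward()
    optimizer.step()
    lrs.append(optimizer.param_groups[0]["lr"])
    losses.append(loss.item())
    scheduler.step()  # LR is multiplied by gamma after every batch

plt.plot(lrs, losses)
plt.xscale("log")
plt.xlabel("learning rate")
plt.ylabel("loss")
plt.show()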

  • To whoever voted to migrate this to Cross Validated: this is a *programming* question, asking how to extract the selected LR after the cyclic LR procedure in Pytorch, not about finding the "best" LR in general (which would be off-topic here indeed). Thus, the question belongs here. – desertnaut May 06 '22 at 09:07

1 Answer


From the PyTorch docs on CyclicLR (https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CyclicLR.html#torch.optim.lr_scheduler.CyclicLR):

You should use get_last_lr() to get the last learning rate computed by this scheduler.
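
For example, a minimal sketch reusing the question's model, train_dls and loss_fn. Note that the docs' example steps CyclicLR after every batch, not once per epoch as in the question (which is why the printed LR only changed between epochs), and that get_last_lr() returns a list with one entry per parameter group:

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-7, max_lr=0.1, step_size_up=5, mode='triangular')

for epoch in range(5):
    model.train()
    for data, label in train_dls:
        optimizer.zero_grad()
        loss = loss_fn(model(data), label)
        loss.backward()
        optimizer.step()
        scheduler.step()                  # step once per batch
    print(scheduler.get_last_lr()[0])     # last LR used in this epoch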

  • Yes, I'm aware that that's the final learning rate of the scheduler, but is that the appropriate learning rate for my model to train with? It seems to me that it changes based on which epoch I'm on. That doesn't seem reliable to me. – Snehangsu May 08 '22 at 03:56
  • @Snehangsu Of course, you are using a learning rate scheduler, which changes your learning rate based on certain rules. If you want to keep your learning rate unchanged during the course of training, just pass a constant value when creating the optimizer. Finding a good learning rate (and other hyper-parameters) belongs to the task of hyper-parameter search. – cao-nv May 08 '22 at 04:27
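
For completeness, keeping the learning rate fixed as the comment suggests just means creating the optimizer without attaching any scheduler, e.g.:

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # constant LR, no scheduler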