I would like to set learning rates at the lowest possible level, i.e. every individual value in a kernel's weights and biases should have its own learning rate.
I can specify a learning rate for an entire weight tensor (layer-wise) like this:
optim = torch.optim.SGD([{'params': model.conv1.weight, 'lr': 0.1},], lr=0.01)
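For completeness, here is a minimal runnable version of that; the `Net` model is just a hypothetical stand-in, any module with a `conv1` layer behaves the same:

```python
import torch
import torch.nn as nn

# Hypothetical minimal model, only so the snippet runs end to end.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return self.conv1(x)

model = Net()

# conv1.weight gets its own lr=0.1; conv1.bias falls back to the default lr=0.01.
optim = torch.optim.SGD(
    [
        {'params': model.conv1.weight, 'lr': 0.1},
        {'params': model.conv1.bias},
    ],
    lr=0.01,
)
```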
But when I try to go one level lower, like this:
optim = torch.optim.SGD([{'params': model.conv1.weight[0, 0, 0, 0], 'lr': 0.1},], lr=0.01)
I receive an error: ValueError: can't optimize a non-leaf Tensor
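Presumably the indexing is the problem: slicing a parameter is itself an autograd operation, so the result is no longer a leaf tensor. A quick check confirms this (assuming the same model as above):

```python
print(model.conv1.weight.is_leaf)              # True  -> a Parameter is a leaf tensor
print(model.conv1.weight[0, 0, 0, 0].is_leaf)  # False -> indexing yields a non-leaf tensor
```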
I also tried passing a learning rate with the same shape as the filter, e.g. 'lr': torch.ones_like(model.conv1.weight), but that didn't work either.
Is there even a way to do this using torch.optim?
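For clarity, this is the update rule I am trying to express, written here as a manual SGD step outside of torch.optim; the element-wise `lr` tensor and the dummy input are my own constructs, not an existing PyTorch API:

```python
# One learning rate per weight entry: same shape as the weight itself.
lr = torch.full_like(model.conv1.weight, 0.01)
lr[0, 0, 0, 0] = 0.1  # this single entry should train with its own, larger rate

inputs = torch.randn(1, 3, 8, 8)  # dummy input for the hypothetical model above
loss = model(inputs).sum()        # placeholder loss, just to produce gradients
loss.backward()

# Manual element-wise SGD step: each weight entry moves by its own rate.
with torch.no_grad():
    model.conv1.weight -= lr * model.conv1.weight.grad
```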