
The documentation for add_param_group() is pretty vague, and there are no code examples showing how to use it. The documentation reads:

Add a param group to the Optimizer's param_groups.

This can be useful when fine-tuning a pre-trained network, as frozen layers can be made trainable and added to the Optimizer as training progresses.

Parameters: param_group (dict) – Specifies what Tensors should be optimized along with group specific optimization options.

Am I right in assuming that I can build a param_group dict by feeding it the values I get from a model's state_dict(), i.e. the actual weight tensors? I am asking because I want to build a progressive network, which means I constantly need to feed Adam the parameters of newly created convolution and activation modules.
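
To make this concrete, here is a rough sketch of what I am trying to do (the layer shapes are made up):

import torch.nn as nn
import torch.optim as optim

# initial network and optimizer
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# later the network grows by a new block; is this the right way
# to hand its parameters to Adam?
new_block = nn.Sequential(nn.Conv2d(16, 32, 3), nn.ReLU())
optimizer.add_param_group({'params': new_block.parameters()})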

Inkplay_

1 Answer


Per the docs, the add_param_group method accepts a param_group parameter that is a dict. Example of use:

import torch
import torch.optim as optim

# two weight tensors that require gradients
w1 = torch.randn(3, 3)
w1.requires_grad = True
w2 = torch.randn(3, 3)
w2.requires_grad = True

# the optimizer initially tracks only w1
o = optim.Adam([w1])
print(o.param_groups)

gives:

[{'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[ 2.9064, -0.2141, -0.4037],
           [-0.5718,  1.0375, -0.6862],
           [-0.8372,  0.4380, -0.1572]])],
  'weight_decay': 0}]

Now add the second tensor as a new parameter group:

o.add_param_group({'params': w2})
print(o.param_groups)

gives:

[{'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[ 2.9064, -0.2141, -0.4037],
           [-0.5718,  1.0375, -0.6862],
           [-0.8372,  0.4380, -0.1572]])],
  'weight_decay': 0},
 {'amsgrad': False,
  'betas': (0.9, 0.999),
  'eps': 1e-08,
  'lr': 0.001,
  'params': [tensor([[-0.0560,  0.4585, -0.7589],
           [-0.1994,  0.4557,  0.5648],
           [-0.1280, -0.0333, -1.1886]])],
  'weight_decay': 0}]
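
The same mechanism covers the progressive-network case from the question. Note that you pass the Parameter objects themselves (e.g. a module's parameters()), not values copied out of state_dict(): the optimizer steps the exact tensors whose gradients are populated during backward. Continuing with the optimizer o from above, a rough sketch with a made-up new layer:

import torch.nn as nn

# suppose a new convolution is created as the network grows
new_conv = nn.Conv2d(16, 32, kernel_size=3)

# register its weight and bias with the existing Adam optimizer;
# per-group options such as 'lr' override the defaults for this group only
o.add_param_group({'params': new_conv.parameters(), 'lr': 1e-4})
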
iacolippo
  • Thank you very much, this is a very clean and precise answer. – Inkplay_ Aug 09 '18 at 11:36
  • Can we also set a learning rate in `add_param_group`? If so, how? I could not find this in the docs. Thanks. – donto May 08 '20 at 09:15
  • @donto you can pass optimizer options in the dict: `o.add_param_group({'params': w2, "lr": 1e-1})` – iacolippo May 20 '20 at 08:55