
What is the recommended operation to protect some weights from being changed by the trainer in MXNet?

As far as I know, if I want to protect some weights in TensorFlow, I should prevent them from being passed to the optimizer. So I did the same in MXNet with the following code.

import mxnet as mx

# 'net' is a model whose pretrained backbone parameters
# have names starting with 'resnet'
all_params = net.collect_params()

# pop the 'resnet' parameters from the front of the ordered dict
while True:
    firstKey = next(iter(all_params._params))
    if 'resnet' not in firstKey:
        break
    all_params._params.popitem(last=False)

trainer = mx.gluon.Trainer(all_params, 'sgd')

The variable all_params._params is an OrderedDict, which suggests that the order of the entries matters and should not be changed. As shown above, that means I can only remove parameters from the beginning of the network, which is very inconvenient. Also, _params starts with an underscore, which means it is not meant to be touched by ordinary users.
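For example, popitem(last=False) can only remove the oldest entry (a small illustration in plain Python, not MXNet-specific):

from collections import OrderedDict

d = OrderedDict([('resnet0_weight', 1), ('resnet0_bias', 2), ('dense0_weight', 3)])
d.popitem(last=False)   # removes the first inserted entry
print(list(d))          # ['resnet0_bias', 'dense0_weight']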

I do not get any errors, but I suspect this is not the recommended operation.
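Would something like the following be cleaner? As far as I can tell, collect_params accepts a regex, so the non-resnet parameters could be selected without touching the private dict (a sketch; I am not sure whether this is the intended mechanism):

# select only parameters whose names do not start with 'resnet'
trainable = net.collect_params('^(?!resnet).*$')
trainer = mx.gluon.Trainer(trainable, 'sgd')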

Blue Bird

1 Answer


As far as I understand, you want to freeze some layers (so that their parameters remain unchanged during training) and you are using Gluon.

In that case you can set a parameter's grad_req attribute to 'null' (it is a string) to prevent it from being updated. Here is an example: I define a set of parameter names I want to freeze and freeze them after creating the model, but before initialization.

import mxnet as mx
from mxnet import gluon

num_hidden = 10
num_outputs = 2   # any output size; defined here so the example runs
ctx = mx.cpu()

net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_outputs))

# parameter names to freeze; print the names in the loop below to find yours
layers_to_freeze = {'sequential1_dense0_weight', 'sequential1_dense0_bias',
                    'sequential1_dense1_weight', 'sequential1_dense1_bias'}

for name, param in net.collect_params().items():
    if name in layers_to_freeze:
        param.grad_req = 'null'   # no gradient is computed or applied

net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)

If you run training, these parameters shouldn't change. You can find the names of the parameters in your model by printing name in the loop above.
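To double-check that the freeze works, here is a minimal sketch of a single training step on dummy data; the input width, batch size, learning rate, and loss function are arbitrary choices for illustration, and it assumes the names in layers_to_freeze actually matched your model's parameter names:

from mxnet import autograd

x = mx.nd.random.uniform(shape=(4, 20), ctx=ctx)            # dummy batch
y = mx.nd.random.uniform(shape=(4, num_outputs), ctx=ctx)   # dummy targets

before = net[0].weight.data().copy()   # snapshot of a frozen weight

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
loss_fn = gluon.loss.L2Loss()
with autograd.record():
    loss = loss_fn(net(x), y)
loss.backward()
trainer.step(batch_size=4)

# all entries equal -> the frozen weight did not move during the update
print((before == net[0].weight.data()).min())   # prints 1.0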

Sergei
  • Another expert has suggested a strategy, and I cite the URL here for exchanging ideas. (https://discuss.mxnet.io/t/what-is-the-recommended-operation-to-protect-some-weights-from-being-changed-by-the-trainer-in-mxnet/1589) – Blue Bird Aug 10 '18 at 15:20