
I wrote a program that contains an algorithm called distributed randomized gradient descent (DRGD). The algorithm keeps some internal variables that are used to calculate the step lengths. MXNet's training algorithms should be more complex than DRGD, so they should have even more internal variables. If we can preserve these variables, we can pause training to test the model and then resume training afterwards.

Blue Bird

2 Answers


If you want to store some data across multiple devices (GPUs or machines), you can use KVStore. The MXNet documentation includes a tutorial on how to use it.

Please note, that KVStore is considered to be quite an advanced feature, and should be used with care.
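As a rough illustration only (the integer key and the shapes below are made up for the example), a minimal KVStore round trip looks roughly like this:

import mxnet as mx

# create a single-machine key-value store
kv = mx.kv.create('local')

# initialize a value under a key, then push an update and pull the result back
shape = (2, 3)
kv.init(3, mx.nd.ones(shape))
kv.push(3, mx.nd.ones(shape) * 2)
out = mx.nd.zeros(shape)
kv.pull(3, out=out)
print(out.asnumpy())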

I am not sure, but what you call a "Trainer" in the MXNet world may actually be called an "Optimizer", so please consider reading the Optimizer API documentation as well.
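To make that relationship concrete, here is a small sketch (the `net` variable is a placeholder): a `gluon.Trainer` can be built either from an optimizer name or from an `mx.optimizer.Optimizer` instance, and the optimizer is where step-size state such as the learning rate and Adam's moment estimates lives.

from mxnet import gluon, optimizer

# these two constructions are equivalent; the Trainer wraps an Optimizer
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.001})
trainer = gluon.Trainer(net.collect_params(), optimizer.Adam(learning_rate=0.001))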

Sergei

It is possible to save the state of the trainer and resume training later by calling the .save_states() and .load_states() methods of the Trainer class when training with MXNet Gluon.

Here is an example:

from mxnet import gluon

trainer = gluon.Trainer(net.collect_params(), 'adam')
trainer.save_states('training.states')   # write the optimizer state to disk
trainer.load_states('training.states')   # restore it later to resume training
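For a full pause/resume cycle, you typically want to persist the model parameters as well as the trainer state. A rough sketch, assuming `net` is a Gluon network, a recent MXNet version, and placeholder file names:

from mxnet import gluon

# pause: save weights and optimizer state
net.save_parameters('net.params')
trainer.save_states('training.states')

# resume: rebuild the network and trainer, then restore both
net.load_parameters('net.params')
trainer = gluon.Trainer(net.collect_params(), 'adam')
trainer.load_states('training.states')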
Thomas