
I am using R mxnet package. Here is the code block that I am currently using. But I am not sure how to specify regularization.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         eval.data          = list(data  = testX,
                                                                   label = testY
                                         ),
                                         initializer        = mx.init.normal(initValVar),
                                         epoch.end.callback = mx.callback.log.train.metric(5, logger)
)
What do the docs say? Seems like either the option is given in the docs or if not then presumably it's not an option. – Hack-R Jul 22 '17 at 17:58

2 Answers


You can set the weight_decay option of your optimizer. Weight decay is equivalent to adding a global L2 regularizer to the parameters.

optimizer = mx.SGD(lr=0.1, momentum=0.9, weight_decay=0.00001)
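The equivalence between weight decay and an L2 penalty can be checked numerically without mxnet at all. The sketch below (plain base R; the variable names are made up for illustration) performs one SGD step each way and confirms the updates match:

```r
# Numeric check: one SGD step on loss + (lambda/2) * ||w||^2 equals
# one SGD step with weight decay applied to the weights.
w      <- c(0.5, -1.2)   # current weights
g      <- c(0.1,  0.3)   # gradient of the unpenalised loss
lr     <- 0.1            # learning rate
lambda <- 0.00001        # weight decay / L2 coefficient

# L2 penalty route: the penalty contributes lambda * w to the gradient
w_l2 <- w - lr * (g + lambda * w)

# Weight decay route: shrink the weights, then apply the plain gradient
w_wd <- (1 - lr * lambda) * w - lr * g

all.equal(w_l2, w_wd)    # TRUE
```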

I am not that familiar with the R API, but judging from the Python API I would expect you to specify the optimizer in mx.fit(model, optimizer, train_provider, n_epoch=20, eval_data=eval_provider), where the first argument is an mx.FeedForward model, rather than in mx.FeedForward.create.

Please see the docs for more information: https://media.readthedocs.org/pdf/mxnet-test/latest/mxnet-test.pdf

– leezu

As @leezu's answer says, you need to set weight decay to get L2 regularisation. In the R API, the argument you need is wd, e.g.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         wd                 = 0.00001)

I think you can include any arguments from mx.opt.rmsprop. Note that the documentation there says that the default value of wd is zero i.e. no regularisation.
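If that is right, the other mx.opt.rmsprop hyper-parameters should pass through the same way as wd. A sketch (untested; the argument names learning.rate and gamma1 and all values here are assumptions for illustration, not recommendations):

```r
dpLnModel <- mx.model.FeedForward.create(symbol           = out,
                                         X                = trainX,
                                         y                = trainY,
                                         ctx              = mx.cpu(),
                                         num.round        = numIter,
                                         eval.metric      = mx.metric.rmse,
                                         array.batch.size = 50,
                                         array.layout     = "rowmajor",
                                         optimizer        = "rmsprop",
                                         learning.rate    = 0.002,   # assumed arg name
                                         gamma1           = 0.95,    # assumed arg name
                                         wd               = 0.00001)
```

Check the mx.opt.rmsprop help page for the exact argument names and defaults in your mxnet version.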