
I am using R mxnet package. Here is the code block that I am currently using. But I am not sure how to specify regularization.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         eval.data          = list(data  = testX,
                                                                   label = testY
                                         ),
                                         initializer        = mx.init.normal(initValVar),
                                         epoch.end.callback = mx.callback.log.train.metric(5, logger)
)
What do the docs say? Seems like either the option is given in the docs or if not then presumably it's not an option. – Hack-R Jul 22 '17 at 17:58

2 Answers


You can set the weight_decay option of your optimizer. Weight decay is equivalent to adding a global L2 regularizer to the parameters.

optimizer = mx.SGD(lr=0.1, momentum=0.9, weight_decay=0.00001)
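The equivalence between weight decay and an L2 penalty can be checked numerically without mxnet at all. The sketch below (plain base R; the variable names are made up for illustration) performs one SGD step each way and confirms the updates match:

```r
# Numeric check: one SGD step on loss + (lambda/2) * ||w||^2 equals
# one SGD step with weight decay applied to the weights.
w      <- c(0.5, -1.2)   # current weights
g      <- c(0.1,  0.3)   # gradient of the unpenalised loss
lr     <- 0.1            # learning rate
lambda <- 0.00001        # weight decay / L2 coefficient

# L2 penalty route: the penalty contributes lambda * w to the gradient
w_l2 <- w - lr * (g + lambda * w)

# Weight decay route: shrink the weights, then apply the plain gradient
w_wd <- (1 - lr * lambda) * w - lr * g

all.equal(w_l2, w_wd)    # TRUE
```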

I am not that familiar with the R API, but judging from the Python API I would expect you to specify the optimizer in mx.fit(model, optimizer, train_provider, n_epoch=20, eval_data=eval_provider), where the first argument is an mx.FeedForward model, rather than in mx.FeedForward.create.

Please see the docs for more information: https://media.readthedocs.org/pdf/mxnet-test/latest/mxnet-test.pdf

– leezu

As @leezu's answer says, you need to set weight decay to get L2 regularisation. In the R API, the argument you need is wd, e.g.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         wd                 = 0.00001)

I think you can include any arguments from mx.opt.rmsprop. Note that the documentation there says that the default value of wd is zero i.e. no regularisation.
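If that is right, the other mx.opt.rmsprop hyper-parameters should pass through the same way as wd. A sketch (untested; the argument names learning.rate and gamma1 and all values here are assumptions for illustration, not recommendations):

```r
dpLnModel <- mx.model.FeedForward.create(symbol           = out,
                                         X                = trainX,
                                         y                = trainY,
                                         ctx              = mx.cpu(),
                                         num.round        = numIter,
                                         eval.metric      = mx.metric.rmse,
                                         array.batch.size = 50,
                                         array.layout     = "rowmajor",
                                         optimizer        = "rmsprop",
                                         learning.rate    = 0.002,   # assumed arg name
                                         gamma1           = 0.95,    # assumed arg name
                                         wd               = 0.00001)
```

Check the mx.opt.rmsprop help page for the exact argument names and defaults in your mxnet version.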