
I'm trying to implement my own optimization algorithm for MXNet (Imperative / Gluon) that does not use gradients. My question is pretty simple: is there a simple way to create a new nn.Dense(...) layer initialized with parameters (i.e. biases and weights) represented by two nd.array() instances?

Thank you in advance!

Lu4
  • Just realized I answered a similar question of yours at https://github.com/apache/incubator-mxnet/issues/11133. Answer below shows implementing own block, which you didn't really want to do, but it's honestly not too tricky! Good luck and post follow up if you get stuck. – Thom Lane Jun 09 '18 at 00:31

1 Answer


You can create a custom block whose parameters are declared with differentiable=False, and supply the data for their initialization through the init argument. See the scales parameter in the example below, taken from this tutorial. The example also uses FullyConnected, which is what you'll want for your dense layer. F denotes a generic backend: typically this is mx.ndarray, but after hybridization it is set to mx.symbol.

import mxnet as mx
from mxnet import gluon

class NormalizationHybridLayer(gluon.HybridBlock):
    def __init__(self, hidden_units, scales):
        super(NormalizationHybridLayer, self).__init__()

        with self.name_scope():
            # Trainable weights; shape (hidden_units, 0) leaves the input
            # dimension unknown until the first forward pass (deferred init).
            self.weights = self.params.get('weights',
                                           shape=(hidden_units, 0),
                                           allow_deferred_init=True)

            # Non-trainable parameter whose values come from the `scales`
            # NDArray passed to the constructor.
            self.scales = self.params.get('scales',
                                          shape=scales.shape,
                                          init=mx.init.Constant(scales.asnumpy().tolist()),  # convert to a plain list to make this object serializable
                                          differentiable=False)

    def hybrid_forward(self, F, x, weights, scales):
        # Min-max normalize the input to the [0, 1] range.
        normalized_data = F.broadcast_div(F.broadcast_sub(x, F.min(x)),
                                          F.broadcast_sub(F.max(x), F.min(x)))
        # Dense (fully connected) transform without a bias term.
        weighted_data = F.FullyConnected(normalized_data, weights,
                                         num_hidden=self.weights.shape[0],
                                         no_bias=True)
        # Element-wise rescaling by the fixed `scales` parameter.
        scaled_data = F.broadcast_mul(scales, weighted_data)
        return scaled_data
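
For completeness, here is a minimal usage sketch (the variable names and shapes below are illustrative, not part of the tutorial): construct the layer with a fixed scales NDArray, initialize it, and run a forward pass. The weights parameter receives its input dimension on the first call thanks to deferred initialization.

scales = mx.nd.array([2.0])                       # values baked into the non-trainable 'scales' parameter
layer = NormalizationHybridLayer(hidden_units=5, scales=scales)
layer.initialize(mx.init.Xavier())                # per-parameter Constant init still applies to 'scales'

x = mx.nd.random.uniform(shape=(8, 3))            # batch of 8 samples with 3 features each
out = layer(x)                                    # first call infers weights shape (5, 3); out is (8, 5)
print(out.shape)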
Thom Lane