I am using BucketingModule to train multiple small models (bots) together, with bot_id as the bucket key. However, each bot has a separate set of target labels/classes, and hence a differently sized softmax layer.

Is there any way to train such a model in MXNet so that the weights of every layer except the last one (softmax) are shared among all the bots?

How would I initialize such a model using the sym_gen method? If, in sym_gen, I set num_hidden=len(size_dict[bot]) for the layer that feeds the softmax, i.e.,

pred = mx.sym.FullyConnected(data=pred, num_hidden=len(size_dict[bot]), name='pred')
pred = mx.sym.SoftmaxOutput(data=pred, label=label, name='softmax')

I get the error:

Inferred shape does not match shared_exec.arg_array's shape

which makes sense, as each bot has a different number of target classes.

quinz

1 Answer

This issue was posted and resolved here: https://github.com/apache/incubator-mxnet/issues/9042

You can make sym_gen(default_bucket_key) return a "master network" that contains all of these FC layers with their different shapes, while sym_gen(other_key) returns a subset of the master network with one particular FC layer. Note that for the master network you will probably need to use mx.sym.Group to group all the outputs together, so that only one symbol is returned.

Sina Afrooze