mxnet: how to debug models with mismatched shapes

Question

I am trying to modify a model I found online (https://github.com/apache/incubator-mxnet/tree/master/example/multivariate_time_series) as I get to know mxnet. I want to build a model that runs a CNN and an RNN in parallel and then uses the outputs of both to forecast a time series. However, I am running into this error:

RuntimeError: simple_bind error. Arguments: data: (128, 96, 20) softmax_label: (128, 20) Error in operator concat1: [15:44:09] src/operator/nn/concat.cc:66: Check failed: shape_assign(&(*in_shape)[i], dshape) Incompatible input shape: expected [128,0], got [128,96,300]

This is the code, as I have tried to modify it:

def rnn_cnn_model(iter_train, q, filter_list, num_filter, dropout, seasonal_period, time_interval):

    # Choose cells for recurrent layers: each cell will take the output of the previous cell in the list
    rcells = [mx.rnn.GRUCell(num_hidden=args.recurrent_state_size)]
    skiprcells = [mx.rnn.LSTMCell(num_hidden=args.recurrent_state_size)]


    input_feature_shape = iter_train.provide_data[0][1]
    X = mx.symbol.Variable(iter_train.provide_data[0].name)
    Y = mx.sym.Variable(iter_train.provide_label[0].name)

    # reshape data before applying convolutional layer (takes 4D shape in case you ever work with images)
    rnn_input = mx.sym.reshape(data=X, shape=(0, q, -1))

    ###############
    # RNN Component
    ###############
    stacked_rnn_cells = mx.rnn.SequentialRNNCell()
    for i, recurrent_cell in enumerate(rcells):
        stacked_rnn_cells.add(recurrent_cell)
        stacked_rnn_cells.add(mx.rnn.DropoutCell(dropout))
    outputs, states = stacked_rnn_cells.unroll(length=q, inputs=rnn_input, merge_outputs=False)
    rnn_features = outputs[-1]  # only take value from final unrolled cell for use later

    input_feature_shape = iter_train.provide_data[0][1]
    X = mx.symbol.Variable(iter_train.provide_data[0].name)
    Y = mx.sym.Variable(iter_train.provide_label[0].name)

    # reshape data before applying convolutional layer (takes 4D shape in case you ever work with images)
    conv_input = mx.sym.reshape(data=X, shape=(0, 1, q, -1))

    ###############
    # CNN Component
    ###############
    outputs = []
    for i, filter_size in enumerate(filter_list):
        # pad input array to ensure number output rows = number input rows after applying kernel
        padi = mx.sym.pad(data=conv_input, mode="constant", constant_value=0,
                          pad_width=(0, 0, 0, 0, filter_size - 1, 0, 0, 0))
        convi = mx.sym.Convolution(data=padi, kernel=(filter_size, input_feature_shape[2]), num_filter=num_filter)
        acti = mx.sym.Activation(data=convi, act_type='relu')
        trans = mx.sym.reshape(mx.sym.transpose(data=acti, axes=(0, 2, 1, 3)), shape=(0, 0, 0))
        outputs.append(trans)
    cnn_features = mx.sym.Concat(*outputs, dim=2)
    cnn_reg_features = mx.sym.Dropout(cnn_features, p=dropout)
    c_features = mx.sym.reshape(data = cnn_reg_features, shape = (-1))
    print(type(c_features))
    ######################
    # Prediction Component
    ######################

    print(rnn_features.infer_shape())
    neural_components = mx.sym.concat(*[rnn_features, c_features], dim=1)
    neural_output = mx.sym.FullyConnected(data=neural_components, num_hidden=input_feature_shape[2])
    model_output = neural_output
    loss_grad = mx.sym.LinearRegressionOutput(data=model_output, label=Y)
    return loss_grad, [v.name for v in iter_train.provide_data], [v.name for v in iter_train.provide_label]

I believe the crash is happening on this line of code:

neural_components = mx.sym.concat(*[rnn_features, c_features], dim=1)

Here is what I have tried in an effort to get my dimensions to match up:

c_features = mx.sym.reshape(data = cnn_reg_features, shape = (-1))
c_features = cnn_reg_features[-1]
c_features = cnn_reg_features[:, -1, :]

I also looked through the git issues and Googled around, but all I see is advice to use infer_shape. I tried applying this to c_features, but the output was not clear to me:

data: ()
gru_i2h_weight: ()
gru_i2h_bias: ()

Basically, I would like to know the shape of each symbol at every stage as this graph is built. I am used to this capability in TensorFlow, where it makes graphs easier to build and debug when an incorrect reshape has sent things astray, or simply helps in getting a sense of how a model works by looking at its dimensions. Is there no equivalent in mxnet?

Given that the data_iter is fed in when producing these symbols, I would think the inferred shape should be available. Ultimately my questions are: (1) how can I see the shape of a symbol when it uses the data in the iterator and should know all shapes? (2) are there general guidelines for debugging in this sort of situation?

Thank you.

Debugging a Symbol network is quite painful. If you can, I recommend switching to the Gluon API, which allows easier debugging. The LSTNet network you are using is implemented with the Gluon API here: https://github.com/safrooze/LSTNet-Gluon - Sergei, May 13 '19 at 23:32
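
For what it is worth, the shapes in the error message can be traced by hand. Below is a minimal sketch in plain Python (no MXNet calls), under assumptions the question does not state: num_filter=100 and filter_list=[6, 12, 18] are hypothetical values picked so that the concatenated width comes out at 300 as in the error, and recurrent_state_size=100 is likewise a guess. The helper names (conv2d_shape, cnn_branch_shape, concat_shape) are illustrative only and are not MXNet functions.

```python
# Hand-traced shapes for the two branches; plain Python, no MXNet required.
# Assumed values (not given in the question): num_filter=100,
# filter_list=[6, 12, 18], recurrent_state_size=100.


def conv2d_shape(shape, kernel, num_filter):
    """Output shape of a stride-1, unpadded 2D convolution on (N, C, H, W)."""
    n, _, h, w = shape
    return (n, num_filter, h - kernel[0] + 1, w - kernel[1] + 1)


def cnn_branch_shape(batch, q, features, filter_list, num_filter):
    """Shape after the pad -> conv -> transpose -> reshape -> concat(dim=2) steps."""
    total_width = 0
    for filter_size in filter_list:
        # Input reshaped to (batch, 1, q, features), then padded with
        # filter_size - 1 rows at the start of axis 2.
        padded = (batch, 1, q + filter_size - 1, features)
        # Kernel spans all features, so the conv output is (batch, num_filter, q, 1).
        _, channels, rows, cols = conv2d_shape(padded, (filter_size, features), num_filter)
        assert (rows, cols) == (q, 1)
        # transpose(0, 2, 1, 3) + reshape(0, 0, 0) gives (batch, q, num_filter).
        total_width += channels
    return (batch, q, total_width)


def concat_shape(shapes, dim):
    """Shape of a concat along `dim`; all other axes (and the rank) must match."""
    first = shapes[0]
    for s in shapes[1:]:
        same_rank = len(s) == len(first)
        if not same_rank or any(a != b for i, (a, b) in enumerate(zip(first, s)) if i != dim):
            raise ValueError("incompatible shapes: %s vs %s" % (first, s))
    out = list(first)
    out[dim] = sum(s[dim] for s in shapes)
    return tuple(out)


cnn = cnn_branch_shape(128, 96, 20, [6, 12, 18], 100)
rnn = (128, 100)  # outputs[-1] of the unrolled cell: (batch, hidden)
print(cnn)        # 3D, while rnn is 2D, so concat along dim=1 cannot work

# One possible fix: flatten everything after the batch axis, as a reshape
# with shape=(0, -1) would (0 copies the input dim, -1 infers the rest).
flat = (cnn[0], cnn[1] * cnn[2])
merged = concat_shape([rnn, flat], dim=1)
print(merged)
```

The trace suggests the RNN branch is 2D (batch, hidden) while the CNN branch is still 3D (batch, q, filters), which matches "expected [128,0], got [128,96,300]". Inside MXNet itself, the same check can be made by passing the input shape by name, for example cnn_features.infer_shape(data=(128, 96, 20)), which returns (arg_shapes, out_shapes, aux_shapes). As far as I can tell, calling infer_shape() with no shapes leaves every argument unknown, which would explain the empty () entries printed in the question.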
