
I want to replace the encoder and decoder in this GitHub code (i.e., in dcrnn_model.py, line 83) with an encoder and an attention decoder.

This is the code before the encoder-decoder:

    max_diffusion_step = int(model_kwargs.get('max_diffusion_step', 2))
    cl_decay_steps = int(model_kwargs.get('cl_decay_steps', 1000))
    filter_type = model_kwargs.get('filter_type', 'laplacian')
    horizon = int(model_kwargs.get('horizon', 1))
    max_grad_norm = float(model_kwargs.get('max_grad_norm', 5.0))
    num_nodes = int(model_kwargs.get('num_nodes', 1))
    num_rnn_layers = int(model_kwargs.get('num_rnn_layers', 1))
    rnn_units = int(model_kwargs.get('rnn_units'))
    seq_len = int(model_kwargs.get('seq_len'))
    use_curriculum_learning = bool(model_kwargs.get('use_curriculum_learning', False))
    input_dim = int(model_kwargs.get('input_dim', 1))
    output_dim = int(model_kwargs.get('output_dim', 1))
    aux_dim = input_dim - output_dim

    # Input (batch_size, timesteps, num_sensor, input_dim)
    self._inputs = tf.placeholder(tf.float32, shape=(batch_size, seq_len, num_nodes, input_dim), name='inputs')
    # Labels: (batch_size, timesteps, num_sensor, input_dim), same format as the input except for the temporal dimension.
    self._labels = tf.placeholder(tf.float32, shape=(batch_size, horizon, num_nodes, input_dim), name='labels')

    # GO_SYMBOL = tf.zeros(shape=(batch_size, num_nodes * input_dim))
    GO_SYMBOL = tf.zeros(shape=(batch_size, num_nodes * output_dim))

    cell = DCGRUCell(rnn_units, adj_mx, max_diffusion_step=max_diffusion_step, num_nodes=num_nodes,
                     filter_type=filter_type)

    cell_with_projection = DCGRUCell(rnn_units, adj_mx, max_diffusion_step=max_diffusion_step, num_nodes=num_nodes,
                                     num_proj=output_dim, filter_type=filter_type)
    encoding_cells = [cell] * num_rnn_layers
    decoding_cells = [cell] * (num_rnn_layers - 1) + [cell_with_projection]
    encoding_cells = tf.contrib.rnn.MultiRNNCell(encoding_cells, state_is_tuple=True)
    decoding_cells = tf.contrib.rnn.MultiRNNCell(decoding_cells, state_is_tuple=True)

    global_step = tf.train.get_or_create_global_step()
    # Outputs: (batch_size, timesteps, num_nodes, output_dim)
    with tf.variable_scope('DCRNN_SEQ'):
        inputs = tf.unstack(tf.reshape(self._inputs, (batch_size, seq_len, num_nodes * input_dim)), axis=1)
        labels = tf.unstack(
            tf.reshape(self._labels[..., :output_dim], (batch_size, horizon, num_nodes * output_dim)), axis=1)
        if aux_dim > 0:
            aux_info = tf.unstack(self._labels[..., output_dim:], axis=1)
            aux_info.insert(0, None)
        labels.insert(0, GO_SYMBOL)

        def _loop_function(prev, i):
            if is_training:
                # Return either the model's prediction or the previous ground truth in training.
                if use_curriculum_learning:
                    c = tf.random_uniform((), minval=0, maxval=1.)
                    threshold = self._compute_sampling_threshold(global_step, cl_decay_steps)
                    result = tf.cond(tf.less(c, threshold), lambda: labels[i], lambda: prev)
                else:
                    result = labels[i]
            else:
                # Return the prediction of the model in testing.
                result = prev
            if False and aux_dim > 0: 
                result = tf.reshape(result, (batch_size, num_nodes, output_dim))
                result = tf.concat([result, aux_info[i]], axis=-1)
                result = tf.reshape(result, (batch_size, num_nodes * input_dim))
            return result
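
For reference, after the unstack calls above, the seq2seq helpers see plain Python lists of 2-D tensors. Assuming I read my configuration and the error correctly (batch_size = 64, seq_len = horizon = 12, num_nodes = 207, input_dim = 2, output_dim = 1, rnn_units = 64), the shapes are:

    # inputs: list of seq_len tensors, each (batch_size, num_nodes * input_dim)
    #         -> 12 tensors of shape (64, 207 * 2) = (64, 414)
    # labels: GO_SYMBOL plus horizon tensors, each (batch_size, num_nodes * output_dim)
    #         -> 13 tensors of shape (64, 207 * 1) = (64, 207)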

This is the original code for encoder-decoder:

    # DCRNN encoder-decoder
    _, enc_state = tf.contrib.rnn.static_rnn(encoding_cells, inputs, dtype=tf.float32)  # encoder
    outputs, final_state = legacy_seq2seq.rnn_decoder(labels, enc_state, decoding_cells,
                                                      loop_function=_loop_function)  # decoder
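
As far as I understand the TF 1.x source, tf.contrib.rnn.static_rnn returns its outputs as a Python list with one tensor per time step, while legacy_seq2seq.attention_decoder expects attention_states to be a single 3-D tensor of shape [batch_size, attn_length, attn_size]. So with the shapes above (my reading, please correct me if it is wrong):

    # encoder_outputs: list of 12 tensors, each (64, 207 * 64) = (64, 13248)
    # enc_state:       tuple with one state per layer of the MultiRNNCell
    # attention_states, as attention_decoder expects it:
    #                  a single tensor of shape (64, 12, 13248)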

My code is as follows:

    # Encoder and attention decoder
    encoder_outputs, enc_state = tf.contrib.rnn.static_rnn(encoding_cells, inputs, dtype=tf.float32)  # encoder
    # First calculate a concatenation of encoder outputs to put attention on.
    top_states = [tf.reshape(encoder_outputs, [-1, 1, decoding_cells.output_size])]
    attention_states = tf.concat(top_states, 1)
    outputs, final_state = legacy_seq2seq.attention_decoder(labels, enc_state, attention_states, decoding_cells,
                                                            loop_function=_loop_function)  # attention decoder

However, this raises a dimension error:

ValueError: Dimensions must be equal, but are 49152 and 64 for 'Train/DCRNN/DCRNN_SEQ/attention_decoder/Attention_0/add' (op: 'Add') with input shapes: [49152,1,1,207], [64,1,1,207].
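
My current suspicion (please correct me if this is wrong): because encoder_outputs is a list, tf.reshape(encoder_outputs, ...) first stacks all 12 time steps and then folds the batch, time, and hidden dimensions together. Since decoding_cells.output_size = num_nodes * output_dim = 207 (the last decoding cell projects to output_dim), the reshape produces attention_states of shape (64 * 12 * 13248 / 207, 1, 207) = (49152, 1, 207) instead of the (64, 12, 13248) that attention_decoder expects, which matches the shapes I printed in the comments below. Mirroring how embedding_attention_seq2seq builds top_states in the legacy_seq2seq source, this is a sketch of what I think the fix should look like (untested; encoding_cells.output_size is my guess for the right attention size):

    # Sketch of a possible fix (untested): reshape each per-time-step encoder
    # output separately so the batch dimension is preserved, and use the
    # encoder's output size (num_nodes * rnn_units = 13248), not the decoder's.
    top_states = [tf.reshape(e, [-1, 1, encoding_cells.output_size])
                  for e in encoder_outputs]      # 12 tensors of (64, 1, 13248)
    attention_states = tf.concat(top_states, 1)  # (64, 12, 13248)
    outputs, final_state = legacy_seq2seq.attention_decoder(
        labels, enc_state, attention_states, decoding_cells,
        loop_function=_loop_function)

Is this the right way to build attention_states here, or is there still a dimension mismatch I am missing?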

  • You have to make both shapes the same, as stated by your error. It cannot get any clearer than that. You should try verifying that your inputs are correct, or converting one shape into the dimensions of the other. – ycx Jan 03 '19 at 08:06
  • Thank you. I suspect the dimensions of the encoder's 'encoder_outputs' do not match what the decoder expects, but I don't know how to locate and fix the bug. – Wendong Zheng Jan 04 '19 at 02:01
  • Have you tried manually changing the dimensions of the input shapes, either with an image resizer or by putting in data that will give you equal shapes? – ycx Jan 04 '19 at 02:11
  • This data is time-series data rather than image data. The original code on GitHub matches the encoder-decoder dimensions, but I added the attention_states and it doesn't match... – Wendong Zheng Jan 04 '19 at 02:29
  • What is the data type of the first dimension? You could possibly add some filler values in that list to at least get an output, and then understand the problem further from there. – ycx Jan 04 '19 at 02:38
  • The data type is float32. I used print() to show the shapes of the variables involved in the code: cell.shape: (); cell_with_projection.shape: (); encoding_cells.shape: (2,); decoding_cells.shape: (2,); encoding_cells.shape: (); decoding_cells.shape: (); encoder_outputs.shape: (12,); attention_states.shape: (49152, 1, 207); labels.shape: (13,); enc_state.shape: (2,) – Wendong Zheng Jan 04 '19 at 07:23
  • 2019-01-04 15:18:21,747 - INFO - ('y_val', (3425, 12, 207, 2)) 2019-01-04 15:18:21,748 - INFO - ('y_test', (6850, 12, 207, 2)) 2019-01-04 15:18:21,749 - INFO - ('x_val', (3425, 12, 207, 2)) 2019-01-04 15:18:21,749 - INFO - ('y_train', (23974, 12, 207, 2)) 2019-01-04 15:18:21,750 - INFO - ('x_train', (23974, 12, 207, 2)) 2019-01-04 15:18:21,750 - INFO - ('x_test', (6850, 12, 207, 2)) – Wendong Zheng Jan 04 '19 at 07:24

0 Answers