
I'm a bit new to Keras and deep learning. I'm currently trying to replicate this paper, but when compiling the first model (the one without the LSTMs) I get the following error:

"ValueError: Error when checking target: expected dense_3 to have shape (None, 120, 40) but got array with shape (8, 40, 1)"

The description of the model is this:

  1. Input (length T is appliance specific window size)
  2. Parallel 1D convolution with filter size 3, 5, and 7 respectively, stride=1, number of filters=32, activation type=linear, border mode=same
  3. Merge layer which concatenates the output of parallel 1D convolutions
  4. Dense layer, output_dim=128, activation type=ReLU
  5. Dense layer, output_dim=128, activation type=ReLU
  6. Dense layer, output_dim=T , activation type=linear

My code is this:

from keras import layers, Input
from keras.models import Model

# the window sizes (seq_length?) are 40, 1075, 465, 72 and 1246 for the kettle, dish washer,
# fridge, microwave, oven and washing machine, respectively.

def ae_net(T):
    input_layer = Input(shape=(T,))
    branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
    branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
    branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)

    merge_layer = layers.concatenate([branch_a, branch_b, branch_c], axis=1)

    dense_1 = layers.Dense(128, activation='relu')(merge_layer)
    dense_2 = layers.Dense(128, activation='relu')(dense_1)
    output_dense = layers.Dense(T, activation='linear')(dense_2)
    model = Model(input_layer, output_dense)
    return model

model = ae_net(40)
model.compile(loss='mean_absolute_error', optimizer='rmsprop')
model.fit(X, y, batch_size=8)

where X and y are numpy arrays of 8 sequences of 40 values each, so X.shape and y.shape are both (8, 40, 1). It's actually one batch of data. What I can't understand is how the output could have shape (None, 120, 40) and what these dimensions mean.


1 Answer

As you noted, your shapes contain batch_size, length and channels: (8, 40, 1).

Each of your three convolutions produces a tensor of shape (8, 40, 32). Your concatenation on axis=1 then creates a tensor of shape (8, 120, 32), where 120 = 3 * 40.

Now, Dense layers only operate on the last dimension (the channels in this case), leaving the length (now 120) untouched. That is exactly where the expected shape (None, 120, 40) in the error comes from: length 120 from the concatenation, and 40 units from your last Dense layer.
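You can check this shape arithmetic without Keras at all. A minimal numpy sketch, where the matrix product over the last axis stands in for what a Dense layer does:

```python
import numpy as np

# Three convolution outputs: (batch, length, filters) = (8, 40, 32) each,
# since padding='same' with stride 1 preserves the length.
a = np.zeros((8, 40, 32))
b = np.zeros((8, 40, 32))
c = np.zeros((8, 40, 32))

# Concatenating on axis=1 stacks along the *length* axis: 3 * 40 = 120.
wrong = np.concatenate([a, b, c], axis=1)
print(wrong.shape)  # (8, 120, 32)

# A Dense(40) layer multiplies only the last axis by a (32, 40) kernel,
# so the length 120 survives into the output -- hence (None, 120, 40).
kernel = np.zeros((32, 40))
print((wrong @ kernel).shape)  # (8, 120, 40)

# Concatenating on the channels axis instead keeps the length at 40.
right = np.concatenate([a, b, c], axis=-1)
print(right.shape)  # (8, 40, 96)
```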

Solution

It seems you do want to keep the length dimension in the output, so you won't need any flatten or reshape layers; you just need to keep that length at 40.

You're probably concatenating on the wrong axis. Instead of the length axis (1), you should concatenate on the channels axis (2, or equivalently -1).

So, this should be your concatenate layer:

merge_layer = layers.Concatenate()([branch_a, branch_b, branch_c])
#or layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])

This will output (8, 40, 96), and the Dense layers will then transform those 96 channels into something else while leaving the length at 40.
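Putting the pieces together, a corrected sketch of the model might look like the following. Note that it also folds in the two follow-up fixes worked out in the comments below: the input shape must be (T, 1) so Conv1D receives a 3D tensor, and the final Dense layer should have 1 unit (not T) since the targets have shape (8, 40, 1). This is one possible reading of the paper's architecture, not the only one:

```python
from keras import layers, Input
from keras.models import Model

def ae_net_fixed(T):
    # Conv1D expects 3D input (batch, length, channels), so the
    # single-channel sequences need an explicit channel dimension.
    input_layer = Input(shape=(T, 1))
    branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
    branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
    branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)

    # Concatenate on the channels axis: three (batch, T, 32) -> (batch, T, 96).
    merge_layer = layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])

    dense_1 = layers.Dense(128, activation='relu')(merge_layer)
    dense_2 = layers.Dense(128, activation='relu')(dense_1)
    # One unit per time step, matching targets of shape (batch, T, 1).
    output_dense = layers.Dense(1, activation='linear')(dense_2)
    return Model(input_layer, output_dense)

model = ae_net_fixed(40)
model.compile(loss='mean_absolute_error', optimizer='rmsprop')
print(model.output_shape)  # (None, 40, 1)
```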

  • Just tried to edit my code with your solution but now I'm getting a different error, irrelevant with the correction. The error is: "ValueError: Input 0 is incompatible with layer conv1d_3: expected ndim=3, found ndim=2" in the line with the first convolution layer (branch_a). I cannot figure out why this pops up even when I run what seems to be identical previously working to this point code. – itroulli Jan 03 '18 at 17:37
  • `input_layer = Input(shape= (T,1))` – Daniel Möller Jan 03 '18 at 17:48
  • Figured this out but now I get the following error: `"ValueError: Error when checking target: expected dense_3 to have shape (None, 40, 40) but got array with shape (8, 40, 1)"` with concatenation on axis=-1 – itroulli Jan 03 '18 at 17:53
  • This problem is here: `output_dense = layers.Dense(T, activation='linear')(dense_2)`. Your last dense layer should have only 1 neuron (instead of T) if your data has only one value (8,40,1). – Daniel Möller Jan 03 '18 at 18:35
  • It's a seq2seq problem and I want the output to be a sequence of the same length as the input. I copied the parameters from the description of the model I found in the paper (as I posted it in my original post). I'll try to run it the way you suggested and come back with feedback. Thanks a lot! :) – itroulli Jan 03 '18 at 18:56
  • It worked! Thank you! :) I have another problem with the second model right now but probably I should post another question about that. – itroulli Jan 10 '18 at 14:35
  • Ok :) -- If you consider this answers your question, please mark it as answered :) – Daniel Möller Jan 10 '18 at 15:04
  • I have just posted the problem I have with the second network [here](https://stackoverflow.com/questions/48407346/typeerror-when-trying-to-create-a-blstm-network-in-keras). Take a look when you have the time! Thanks a lot! :) – itroulli Jan 23 '18 at 17:16