1

I have already looked at various answers, but I don't understand why I keep getting an error that mentions the shape (10, 5).

Why is it asking for a shape of (10,5)? Where is it even getting that number from? I am under the impression that the shape of the input data should be ("sample_size", "steps or time_len", "channels or feat_size") => (3809, 49, 5).

I am also under the impression that the input shape for Conv1D layer should be ("steps or time_len", "channels or feat_size").

Am I misunderstanding something?

My input data looks something like this: [image not included]

There are 49 days in total, with 5 data points per day. The total sample size is 5079, with 75% of the data used for training and 25% for validation. There are 10 possible output classes to predict.

x_train, x_test, y_train, y_test = train_test_split(np_train_data, np_train_target, random_state=0)
print(x_train.shape)
x_train = x_train.reshape(x_train.shape[0], round(x_train.shape[1]/5), 5)
x_test = x_test.reshape(x_test.shape[0], round(x_test.shape[1]/5), 5)
print(x_train.shape)
input_shape = (round(x_test.shape[1]/5), 5)

model = Sequential()
model.add(Conv1D(100, 2, activation='relu', input_shape=input_shape))
model.add(MaxPooling1D(3))
model.add(Conv1D(100, 2, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(49, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    
model.fit(x_train, y_train, batch_size=64, epochs=2, validation_data=(x_test, y_test))
print(model.summary())

I get this error: [image not included]

Printout of the layers: [image not included]

2 Answers

0

You are using Conv1D, but by reshaping you are trying to represent your data in 2D, which causes the problem. Try skipping the reshaping step, so that your input is a single row with 49 values:

x_train, x_test, y_train, y_test = train_test_split(np_train_data, np_train_target, random_state=0)
print(x_train.shape)
input_shape = (x_test.shape[1], 1)

model = Sequential()
model.add(Conv1D(100, 2, activation='relu', input_shape=input_shape))
model.add(MaxPooling1D(3))
model.add(Conv1D(100, 2, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(49, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=64, epochs=2, validation_data=(x_test, y_test))
Danylo Baibak
  • This is not correct. They are reshaping their data to 3D, batch x time x features, which is the format needed for `Conv1D`. – xdurch0 Jun 30 '20 at 08:43
  • @xdurch0 I was under that impression as well (Conv1D input must be represented in "2D" and Conv2D input must be represented in "3D") – the_begging_beginner Jun 30 '20 at 08:46
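
To illustrate the batch x time x features layout mentioned in the comments, here is a minimal sketch on toy data. The sizes (2 samples, 3 days, 5 values per day) and the names `flat` and `cube` are hypothetical, and it assumes each flat row is ordered day by day, which the question does not confirm.

```python
import numpy as np

# Hypothetical toy data: 2 samples, 3 days, 5 values per day, stored flat.
# Assumption: each row lists day 0's 5 values, then day 1's, and so on.
n_samples, n_days, n_feats = 2, 3, 5
flat = np.arange(n_samples * n_days * n_feats).reshape(n_samples, n_days * n_feats)

# Reshape to (batch, time, features), mirroring the reshape in the question.
cube = flat.reshape(flat.shape[0], flat.shape[1] // n_feats, n_feats)

print(cube.shape)   # (2, 3, 5)
print(cube[0, 1])   # the 5 values of sample 0, day 1: [5 6 7 8 9]
```

Under that ordering assumption, each time step of the 3D array holds one day's 5 values, and the array as a whole is the 3D input that `Conv1D` expects.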
0

You are dividing by 5 twice. Here you reshape your data, which, contrary to what the other answer says, is necessary:

x_train = x_train.reshape(x_train.shape[0], round(x_train.shape[1]/5), 5)
x_test = x_test.reshape(x_test.shape[0], round(x_test.shape[1]/5), 5)

This already takes care of "dividing the time by 5". But here you are defining the input shape to the model, dividing by 5 again:

input_shape = (round(x_test.shape[1]/5), 5)

Simply use

input_shape = (x_test.shape[1], 5)

instead! Note that because this line runs after the reshape, x_test.shape[1] already refers to the reshaped time dimension, i.e. the original length divided by 5.
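
The arithmetic behind the (10, 5) can be checked with NumPy alone. This is a sketch on a small stand-in array, not the real data: the 8 samples and the names `buggy_input_shape`, `fixed_input_shape` and `generic_input_shape` are made up for illustration, while the 49 days and 5 values per day come from the question.

```python
import numpy as np

# Stand-in for the real data: 8 samples (hypothetical), 49 days x 5 values, stored flat.
n_samples, n_days, n_feats = 8, 49, 5
x_test = np.zeros((n_samples, n_days * n_feats))           # shape (8, 245)

# Same reshape as in the question: 245 / 5 = 49 time steps.
x_test = x_test.reshape(x_test.shape[0], round(x_test.shape[1] / 5), 5)

# Dividing again after the reshape: round(49 / 5) = round(9.8) = 10.
buggy_input_shape = (round(x_test.shape[1] / 5), 5)

# Reading the time dimension directly after the reshape.
fixed_input_shape = (x_test.shape[1], 5)

# Alternative that avoids hard-coding: everything except the batch dimension.
generic_input_shape = x_test.shape[1:]

print(x_test.shape)          # (8, 49, 5)
print(buggy_input_shape)     # (10, 5)
print(fixed_input_shape)     # (49, 5)
print(generic_input_shape)   # (49, 5)
```

So the 10 is just 49 / 5 = 9.8 rounded up, which is why the model ends up expecting (10, 5) while the data has shape (49, 5) per sample.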

xdurch0