I am using Keras to build a model, with the notebook at https://github.com/tensorflow/workshops/blob/master/extras/keras-bag-of-words/keras-bow-model.ipynb as my reference.
The only difference between the approach in that notebook and mine is the encoding.
My vocabulary has only 35 words, so I one-hot encode each word like this:
[0 0 0 0 0 0 1 0 0 0 ... 0 0] // column length is 35
So a three-word sentence, once encoded, looks like this:
[[[0 0 0 ... 0 0 0 1]]
[[0 0 0 ... 0 1 0 0]]
[[0 0 0 ... 0 0 1 0]]]
and its shape is (3, 1, 36).
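To make the encoding reproducible, here is a minimal sketch of how an array of that shape can be built. It assumes NumPy, a vector width of 36 (taken from the printed shape), and hypothetical word indices; `one_hot` and `encode_sentence` are illustrative helper names, not part of the reference notebook.

```python
import numpy as np

VECTOR_SIZE = 36  # assumed width, taken from the printed shape


def one_hot(index, size=VECTOR_SIZE):
    """Return a (1, size) row vector with a single 1 at `index`."""
    vec = np.zeros((1, size), dtype=int)
    vec[0, index] = 1
    return vec


def encode_sentence(word_indices):
    """Stack one (1, size) vector per word into (num_words, 1, size)."""
    return np.stack([one_hot(i) for i in word_indices])


# Hypothetical word indices for a three-word sentence.
encoded = encode_sentence([35, 33, 34])
print(encoded.shape)  # (3, 1, 36)
```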
I want to pass thousands of encoded sentences for training, and each sentence has a different number of words. When I print the shapes of the encoded sentences, I get (3, 1, 36), (4, 1, 36), (7, 1, 36), (4, 1, 36), (9, 1, 36), (3, 1, 36), ..., (5, 1, 36).
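For comparison, a bag-of-words input is usually one fixed-length vector per sentence rather than one vector per word. The sketch below shows one possible way to collapse the variable-length encodings into a 2-D array by summing the one-hot rows. It assumes NumPy, a width of 36, hypothetical word indices, and that word order can be discarded; it is an illustration, not necessarily the right choice for this data.

```python
import numpy as np

VECTOR_SIZE = 36  # assumed width, taken from the printed shapes


def encode_sentence(word_indices, size=VECTOR_SIZE):
    """One-hot encode each word; result has shape (num_words, 1, size)."""
    encoded = np.zeros((len(word_indices), 1, size), dtype=int)
    for row, index in enumerate(word_indices):
        encoded[row, 0, index] = 1
    return encoded


# Hypothetical sentences given as lists of word indices.
sentences = [[35, 33, 34], [1, 2, 3, 4], [0, 5, 5, 7, 9, 11, 13]]
encoded = [encode_sentence(s) for s in sentences]
print([e.shape for e in encoded])  # [(3, 1, 36), (4, 1, 36), (7, 1, 36)]

# Sum over the word axis and the singleton axis: one count vector per sentence.
bag_of_words = np.stack([e.sum(axis=(0, 1)) for e in encoded])
print(bag_of_words.shape)  # (3, 36)
```

A 2-D array of shape (num_sentences, 36) like this would correspond to a first layer declared with input_shape=(36,) instead of input_shape=(max_words,) with max_words = 1000.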
Now I need to pass these to the model.
I created the model as below:
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout

batch_size = 32
epochs = 2
max_words = 1000

# Build the model
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# model.fit trains the model
# The validation_split param tells Keras what % of our training data should
# be used in the validation set
# You can see the validation loss decreasing slowly when you run this
# Because val_loss is no longer decreasing we stop training to prevent overfitting
# Note: when validation_data is also given, Keras uses it instead of validation_split
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1,
                    validation_data=(x_test, y_test))
After running this, I got the following error message: "Error when checking input: expected dense_71_input to have 2 dimensions, but got array with shape (3, 1, 36)"
I am sure I am doing something fundamentally wrong, but I am not sure how to fix it. I am a novice user, and extensive searching didn't help either.
Any help in resolving this would be much appreciated. Thanks.