I am trying to build a very simple OCR model as a starting point before moving on to bigger models. The problem is that I can't figure out what shape my output (target) data should have for training.
Code:
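from keras import backend as K
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, Reshape, GRU,
                          TimeDistributed, Dense, Activation)
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

# input_height, input_width and conv_blades are module-level settings (not shown).

def simple_model():
    output = 28  # number of classes predicted at each time step
    if K.image_data_format() == 'channels_first':
        input_shape = (1, input_height, input_width)
    else:
        input_shape = (input_height, input_width, 1)
    # After one 2x2 pooling: (time steps, features per time step)
    conv_to_rnn_dims = (input_width // 2, (input_height // 2) * conv_blades)
    model = Sequential()
    model.add(Conv2D(conv_blades, (3, 3), input_shape=input_shape, padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2), name='max2'))
    model.add(Reshape(target_shape=conv_to_rnn_dims, name='reshape'))
    model.add(GRU(64, return_sequences=True, kernel_initializer='he_normal', name='gru1'))
    model.add(TimeDistributed(Dense(output, kernel_initializer='he_normal', name='dense2')))
    model.add(Activation('softmax', name='softmax'))
    model.compile(loss='mse',
                  optimizer='adamax',
                  metrics=["accuracy"])
    return model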

img = load_img('exit.png', grayscale=True, target_size=[input_height, input_width])
x = img_to_array(img)
x = x.reshape((1,) + x.shape)  # add the batch dimension
y = np.array(['exit'])         # target as a raw string, shape (1,)
model = simple_model()
model.fit(x, y, batch_size=1,
          epochs=10,
          validation_data=(x, y),
          verbose=1)
print(model.predict(x))  # predict on the image, not on the label
Image Example:
(source: exitfest.org)
When I run this code, I get the following error:
ValueError: Error when checking target: expected softmax to have 3 dimensions, but got array with shape (1, 1)
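If I read the error correctly, the softmax layer produces one 28-way distribution per time step, so the target would need shape (batch, time_steps, 28) instead of a single string. Below is a minimal NumPy sketch of what such an array could look like for the label 'exit'. Everything in it is an assumption for illustration only: the alphabet, the use of the last class as padding, the helper name `encode_label`, and the 64 time steps (which would correspond to `input_width // 2` for a hypothetical `input_width` of 128).

```python
import numpy as np

# Assumed alphabet: 26 lowercase letters plus space, plus one padding class.
ALPHABET = 'abcdefghijklmnopqrstuvwxyz '
NUM_CLASSES = len(ALPHABET) + 1   # 28, same as `output` in simple_model()
BLANK = NUM_CLASSES - 1           # index 27, used here only as padding
TIME_STEPS = 64                   # assumed: input_width // 2 with input_width = 128


def encode_label(text, time_steps=TIME_STEPS):
    """One-hot encode `text`, padding with BLANK up to `time_steps` (hypothetical helper)."""
    if len(text) > time_steps:
        raise ValueError('label is longer than the number of time steps')
    indices = [ALPHABET.index(ch) for ch in text]
    indices += [BLANK] * (time_steps - len(indices))
    target = np.zeros((time_steps, NUM_CLASSES), dtype='float32')
    target[np.arange(time_steps), indices] = 1.0  # one class set per time step
    return target


y = encode_label('exit')[np.newaxis, ...]  # add the batch dimension
print(y.shape)  # (1, 64, 28)
```

This only makes the shapes line up. It forces each character onto a fixed time step, which is probably not how image-to-sequence models are normally trained; as far as I can tell, the Keras image_ocr example uses a CTC loss for that reason.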
Note 1: I know I can't train a model with only one image and one label. I have many more images like this one, but I want to get this simple model running before improving it.
Note 2: This is the first time I have worked with image-to-sequence output, so the code may have other problems. Feel free to change it if you spot that kind of mistake.