I am trying to build a very simple OCR model as a starting point before testing bigger models. The problem is that I can't figure out what shape my output (target) data should have for training.

Code:

def simple_model():

    output = 28

    if K.image_data_format() == 'channels_first':
        input_shape = (1, input_height, input_width)
    else:
        input_shape = (input_height, input_width, 1)

    conv_to_rnn_dims = (input_width // (2), (input_height // (2)) * conv_blades)

    model = Sequential()
    model.add(Conv2D(conv_blades, (3, 3), input_shape=input_shape, padding='same'))
    model.add(MaxPooling2D(pool_size=(2,2), name='max2'))
    model.add(Reshape(target_shape=conv_to_rnn_dims, name='reshape'))
    model.add(GRU(64, return_sequences=True, kernel_initializer='he_normal', name='gru1'))
    model.add(TimeDistributed(Dense(output, kernel_initializer='he_normal', name='dense2')))
    model.add(Activation('softmax', name='softmax'))

    model.compile(loss='mse',
                optimizer='adamax',
                metrics=["accuracy"])

    return model

img = load_img('exit.png', grayscale=True, target_size=[input_height, input_width])
x = img_to_array(img)
x = x.reshape((1,) + x.shape)

y = np.array(['exit'])

model = simple_model()

model.fit(x, y, batch_size=1,
                    epochs=10,
                    validation_data=(x, y),
                    verbose=1)

print(model.predict(x))

Image Example:

Image
(source: exitfest.org)

When I run this code, I get the following error:

ValueError: Error when checking target: expected softmax to have 3 dimensions, but got array with shape (1, 1)

Note 1: I know I can't train my model with only one image and one label. I have many more images like this one, but I need to get this simple model running before I improve it.

Note 2: This is the first time I have worked with image-to-sequence output, so the code may have other problems. Feel free to change it if you spot that kind of mistake.

Glorfindel
Claudio
  • It sounds like something is wrong with the shape of the input that goes into the model. – webdizz Aug 10 '17 at 15:41
  • OK, so how should it be? – Claudio Aug 10 '17 at 16:53
  • Sorry, it's just a hypothesis. I don't know yet how it should be because I'm still working through this topic myself. – webdizz Aug 10 '17 at 17:23
  • I spent some time trying to figure out how to adapt the softmax activation, but have had no luck so far. However, there's an example, https://github.com/fchollet/keras/blob/master/examples/image_ocr.py, which is pretty similar to what you want to do. – webdizz Aug 11 '17 at 16:28
  • The example in keras/examples uses CTC at the end. I want to build a neural network without CTC because I don't understand it. I have seen in some papers that it improves precision and recall, but I don't think it is time for me to use it yet. – Claudio Aug 11 '17 at 18:35
  • Well, as I haven't received any answer, I will link to the answer [I posted in another question](https://stackoverflow.com/questions/44847446/how-can-i-use-the-keras-ocr-example/49537697#49537697) – Claudio Jun 26 '18 at 00:49

1 Answer

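Well, as I haven't received any answer, I will link to the answer [I posted in another question](https://stackoverflow.com/questions/44847446/how-can-i-use-the-keras-ocr-example/49537697#49537697).

There I explain how to use the Keras OCR example and answer some other questions.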
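As a rough illustration of the shape issue behind the ValueError in the question: the softmax layer emits one 28-way distribution per time step, so the target needs three dimensions, (batch, time_steps, 28), rather than the (1, 1) array of strings. The sketch below is only one possible encoding and rests on assumptions that are not in the question: the alphabet (26 lowercase letters, a space, and a padding "blank" class, picked only so the total matches `output = 28`), the helper name `encode_label`, and the value of `input_width`. It also left-aligns the characters and pads the rest, which does not match where the characters actually sit in the image; that alignment problem is what CTC is designed to handle.

```python
import numpy as np

# Assumption: 26 lowercase letters + space + one "blank" padding class = 28 classes.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
BLANK = len(ALPHABET)            # index 27, used to pad unused time steps
NUM_CLASSES = len(ALPHABET) + 1  # 28, matching `output = 28` in the question


def encode_label(text, time_steps):
    """Hypothetical helper: one-hot encode `text` into (time_steps, NUM_CLASSES)."""
    if len(text) > time_steps:
        raise ValueError("label is longer than the number of time steps")
    # One class index per time step: the characters first, then blank padding.
    indices = [ALPHABET.index(ch) for ch in text]
    indices += [BLANK] * (time_steps - len(text))
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(NUM_CLASSES, dtype="float32")[indices]


input_width = 128               # assumed value; the question does not state it
time_steps = input_width // 2   # one 2x2 max-pooling layer halves the width

# Stack per-image targets into a batch: (batch, time_steps, NUM_CLASSES).
y = np.stack([encode_label("exit", time_steps)])
print(y.shape)
```

With one-hot targets like these, `categorical_crossentropy` is the more usual loss than `mse`, and `model.fit(x, y, ...)` would then receive a target with the three dimensions the error message asks for.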

Claudio