
I'm trying to implement a simple image-to-text network. For every image, the network outputs 5 characters, and there are 23 possible characters, so each label has shape 5x23. When I try to fit the model, I get the following error:

ValueError: Data cardinality is ambiguous:
(...)

When I pass just one example, the message is:

ValueError: Data cardinality is ambiguous:
  x sizes: 1
  y sizes: 5
Please provide data which shares the same first dimension.

How can I properly train the network for this task?

The model is as follows:

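from tensorflow.keras.layers import (
    Input, Conv2D, MaxPooling2D, Reshape, Dense, GRU, add
)

# h, w, conv_filters, kernel_size, pool_size and n_tokens are defined elsewhere.
input_layer = Input(shape=(h, w, 1))
x = Conv2D(conv_filters, kernel_size, activation='relu')(input_layer)
x = MaxPooling2D((pool_size, pool_size))(x)
x = Conv2D(conv_filters, kernel_size, activation='relu')(x)
x = MaxPooling2D((pool_size, pool_size))(x)

# Split the feature map into 5 time steps, one per output character.
x = Reshape(target_shape=(5, -1))(x)
x = Dense(5, activation='swish')(x)

# Forward and backward GRUs over the 5 time steps.
fw = GRU(128, return_sequences=True, kernel_initializer='he_normal')(x)
bw = GRU(128, return_sequences=True, go_backwards=True, kernel_initializer='he_normal')(x)

bgru = add([fw, bw])
# Per-step softmax over the tokens: output shape is (batch, 5, n_tokens).
output = Dense(n_tokens, activation='softmax')(bgru)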
waaat
  • Please show your input data – Nicolas Gervais Dec 11 '20 at 13:46
  • The input is a 47x90 black and white image with 1 channel – waaat Dec 11 '20 at 14:37
  • here is an example https://ibb.co/xHjC6ry – waaat Dec 11 '20 at 17:59
  • The first dimension of `x` and `y` is different. The first dimension is the batch size, and it must be the same for both. Please make sure `y` also has the shape `(1, something)`. [This answer](https://stackoverflow.com/a/62261086/14290681) may help. Thanks! –  Jan 16 '21 at 13:45
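
To illustrate the point in the last comment, here is a minimal sketch of how the labels could be arranged so that `x` and `y` share the same first dimension. It uses only NumPy and placeholder data. The names `ALPHABET`, `one_hot_label` and `build_labels`, the 23-letter alphabet, and the example strings are all assumptions made for illustration; they are not taken from the question. The likely cause of the error is passing a single `(5, 23)` label (or a list of 5 arrays) for one image, so that Keras reads 5 as the number of samples.

```python
import numpy as np

# Hypothetical 23-character alphabet; substitute the real character set.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVW"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_label(text, n_chars=5):
    """Encode one n_chars-long string as an (n_chars, len(ALPHABET)) one-hot array."""
    if len(text) != n_chars:
        raise ValueError(f"expected {n_chars} characters, got {len(text)}")
    label = np.zeros((n_chars, len(ALPHABET)), dtype=np.float32)
    for position, ch in enumerate(text):
        label[position, CHAR_TO_INDEX[ch]] = 1.0
    return label


def build_labels(texts):
    """Stack per-image labels along a new leading (sample) axis."""
    return np.stack([one_hot_label(t) for t in texts], axis=0)


# Placeholder images with the 47x90x1 shape mentioned in the comments.
h, w = 47, 90
texts = ["ABCDE", "WVUTS"]  # made-up labels
x = np.zeros((len(texts), h, w, 1), dtype=np.float32)
y = build_labels(texts)

# Both arrays must agree on the number of samples (first dimension).
assert x.shape[0] == y.shape[0]

# A single example still needs the leading sample axis on both arrays.
x_single = x[:1]                                    # shape (1, 47, 90, 1)
y_single = one_hot_label("ABCDE")[np.newaxis, ...]  # shape (1, 5, 23)
```

With labels shaped `(n_samples, 5, 23)`, the target matches the `(batch, 5, n_tokens)` output of the final `Dense` layer, and a loss such as `categorical_crossentropy` would then be applied per character position.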
