
I am having trouble getting this model to compile.

I am trying to implement a VGG16, but I will be using a custom loss function. The target variable has a shape of (?, 14, 14, 9, 6). Binary cross-entropy is only applied against Y_train[:,:,:,:,1], with Y_train[:,:,:,:,0] used as a switch that turns the loss off, effectively making this a mini-batch -- the remaining channels will be used on a separate branch of the neural net. This is a binary classification problem on this branch, so I only want an output of shape (?, 14, 14, 9, 1).
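
To make the masking concrete, this is roughly the behaviour I am after (a toy NumPy sketch with made-up values; channel 0 is the switch and channel 1 the label, matching the loss code below):

import numpy as np

eps = 1e-4
# Toy target with the same layout as Y_train: channel 0 = switch, channel 1 = label.
y_true = np.zeros((1, 14, 14, 9, 6), dtype=np.float32)
y_true[..., 0] = np.random.randint(0, 2, (1, 14, 14, 9))   # 1 = position counts towards the loss
y_true[..., 1] = np.random.randint(0, 2, (1, 14, 14, 9))   # binary class label
y_pred = np.random.uniform(0.01, 0.99, (1, 14, 14, 9))     # sigmoid outputs of this branch

bce = -(y_true[..., 1] * np.log(y_pred) + (1 - y_true[..., 1]) * np.log(1 - y_pred))
loss = np.sum(y_true[..., 0] * bce) / (eps + np.sum(y_true[..., 0]))
print(loss)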

I have listed my error below. Can you please explain firstly what is going wrong and secondly how to mitigate this issue?

Model code:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Reshape
from tensorflow.keras.models import Model

img_input = Input(shape=(224, 224, 3))

x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# # Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# # Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# # Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)

x = Conv2D(512, (3, 3), padding='same', activation='relu', kernel_initializer='normal', name='rpn_conv1')(x)

x_class = Conv2D(9, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)

x_class = Reshape((14,14,9,1))(x_class)
model = Model(inputs=img_input, outputs=x_class)
model.compile(loss=rpn_loss_cls(), optimizer='adam')
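
The model itself builds, and the output shape is what I expect (only blocks 1-4 pool, so the spatial size is 224 / 2**4 = 14, which matches the reshape to (14, 14, 9, 1)); it is the compile call with the custom loss that fails. Quick check with the compile line commented out:

print(model.output_shape)   # (None, 14, 14, 9, 1)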

Loss function code:

from tensorflow.keras import backend as K

def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):

    def rpn_loss_cls_fixed_num(y_true, y_pred):
        # y_true[..., 0] is the switch (1 = position contributes to the loss),
        # y_true[..., 1] is the binary class label.
        return (lambda_rpn_class
                * K.sum(y_true[:, :, :, :, 0]
                        * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1]))
                / K.sum(epsilon + y_true[:, :, :, :, 0]))

    return rpn_loss_cls_fixed_num

Error:

ValueError: logits and labels must have the same shape ((?, ?, ?, ?) vs (?, 14, 14, 9, 1))

Note: I have read multiple questions on this site with the same error, but none of the solutions allowed my model to compile.

Potential solution:

I continued messing with this and found that by adding the following inside the loss function

y_true = K.expand_dims(y_true, axis=-1)

I was able to compile the model. I am still dubious that this is going to work correctly, though.
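
A quick shape check (a minimal sketch with dummy constant tensors, not my real data) suggests why this helps: after expand_dims the channel slices keep a trailing singleton axis, so they have the same rank as the prediction:

import tensorflow as tf

y_pred = tf.zeros((2, 14, 14, 9, 1))   # what the class branch outputs
y_true = tf.zeros((2, 14, 14, 9, 6))   # the target as I construct it

# Without expand_dims the label slice drops the last axis:
print(tf.keras.backend.int_shape(y_true[:, :, :, :, 1]))    # (2, 14, 14, 9)

# After expand_dims the same slice keeps a trailing singleton axis,
# matching the rank of y_pred:
y_true_e = tf.keras.backend.expand_dims(y_true, axis=-1)    # (2, 14, 14, 9, 6, 1)
print(tf.keras.backend.int_shape(y_true_e[:, :, :, :, 1]))  # (2, 14, 14, 9, 1)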

  • Your model input has 3 dimensions, and you're trying to get 4-dimensional data out of it. What is your input size? Are you sure it can be broadcast to your output size? – Sharky Apr 09 '19 at 15:17
  • Image data with channels = `(?, 224, 224, 3)`. Can you not project the input to a higher dimension? – Collin Cunningham Apr 09 '19 at 15:24
  • What do you mean by 'Can you not project'? It's not completely clear what you're trying to accomplish. Do you want to input an image of shape (224, 224, 3) and get an output of shape (14, 14, 9, 1)? – Sharky Apr 09 '19 at 15:34
  • Yes exactly. That is what I was trying to do with the reshape. – Collin Cunningham Apr 09 '19 at 15:58

1 Answer


Keras builds the y_true placeholder from the model's output shape, which is why your loss function gets a shape-mismatch error. You need to align the dimensions with expand_dims. This, however, has to be done with your model architecture, data, and loss function in mind. The code below will compile.

import tensorflow as tf
from tensorflow.keras import backend as K

def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):

    def rpn_loss_cls_fixed_num(y_true, y_pred):
        # Add a trailing singleton axis so that the channel slices below
        # keep the same rank as y_pred.
        y_true = tf.keras.backend.expand_dims(y_true, -1)
        return (lambda_rpn_class
                * K.sum(y_true[:, :, :, :, 0]
                        * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1]))
                / K.sum(epsilon + y_true[:, :, :, :, 0]))

    return rpn_loss_cls_fixed_num
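
As a quick smoke test (a sketch with random dummy tensors rather than real RPN targets, reusing the imports above), you can call the returned loss function directly and check that it now produces a scalar:

import numpy as np

loss_fn = rpn_loss_cls()
y_true = tf.constant(np.random.randint(0, 2, (2, 14, 14, 9, 6)).astype('float32'))
y_pred = tf.constant(np.random.rand(2, 14, 14, 9, 1).astype('float32'))
print(K.eval(loss_fn(y_true, y_pred)))   # a single scalar loss value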
  • Could you describe some of the issues that could arise from this transformation? – Collin Cunningham Apr 09 '19 at 17:10
  • I mean that you need to know what you're trying to do with this transformation. It should be differentiable, so no immediate problems should arise. But loss function is in general a leaky concept, and this is well beyond scope of this site. Hope this helps! – Sharky Apr 09 '19 at 17:36
  • But how would adding a dimension change differentiability when it is not using that dimension? I.e. `expand_dims` does not directly affect smoothness. – Collin Cunningham Apr 09 '19 at 18:41
  • I meant loss function in general. Adding dimension won't affect it of course. – Sharky Apr 09 '19 at 18:51