
I am making a model where the prediction is a matrix from a conv layer. My loss function is

def custom_loss(y_true, y_pred):
    print("in loss...")
    final_loss = float(0)
    print(y_pred.shape)
    print(y_true.shape)
    for i in range(7):
        for j in range(14):
            tl = float(0)
            gt = y_true[i,j]
            gp = y_pred[i,j]
            if gt[0] == 0:
                tl = K.square(gp[0] - gt[0])
            else:
                for l in range(5):
                    tl = tl + K.square(gp[l] - gt[l])/5
            final_loss = final_loss + tl/98
    return final_loss

The shapes printed from the arguments are

(?, 7, 14, 5)

(?, ?, ?, ?)

The labels are of shape 7x14x5.

It seems like the loss function gets called on a whole batch of predictions instead of one prediction at a time. I am relatively new to Keras and don't really understand how these things work.
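That is indeed how Keras works: the loss receives whole batches. A minimal probe loss (just a sketch to inspect shapes, not the real loss above) shows the 4D shapes and what a loss is expected to return:

```python
import tensorflow as tf

def shape_probe_loss(y_true, y_pred):
    # Keras hands the loss whole batches: both arguments are 4D,
    # (batch_size, 7, 14, 5), not a single 7x14x5 sample.
    tf.print("y_true:", tf.shape(y_true), "y_pred:", tf.shape(y_pred))
    # A loss should return a tensor (scalar or per-sample), not a Python float.
    return tf.reduce_mean(tf.square(y_true - y_pred))

loss = shape_probe_loss(tf.zeros((2, 7, 14, 5)), tf.ones((2, 7, 14, 5)))
```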

This is my model:

model = Sequential()
input_shape=(360, 640, 1)

model.add(Conv2D(24, (5, 5), strides=(1, 1), input_shape=input_shape))
model.add(MaxPooling2D((2,4), strides=(2, 2)))

model.add(Conv2D(48, (5, 5), padding="valid"))
model.add(MaxPooling2D((2,4), strides=(2, 2)))

model.add(Conv2D(48, (5, 5), padding="valid"))
model.add(MaxPooling2D((2,4), strides=(2, 2)))

model.add(Conv2D(24, (5, 5), padding="valid"))
model.add(MaxPooling2D((2,4), strides=(2, 2)))

model.add(Conv2D(5, (5, 5), padding="valid"))
model.add(MaxPooling2D((2,4), strides=(2, 2)))


model.compile(
    optimizer="Adam",
    loss=custom_loss,
    metrics=['accuracy'])

print(model.summary())

I am getting an error like

ValueError: slice index 7 of dimension 1 out of bounds. for 'loss/max_pooling2d_5_loss/custom_loss/strided_slice_92' (op: 'StridedSlice') with input shapes: [?,7,14,5], [2], [2], [2] and with computed input tensors: input[1] = <0 7>, input[2] = <1 8>, input[3] = <1 1>.

I think this is because the arguments to the loss function are given as a batch of predictions at a time, as 4D tensors.

How can I fix this? Is the problem in the way I assign the loss function, or in the loss function itself? For now, the output of the loss function is a Python float, but what is it supposed to be?
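A small sketch (assuming eager TensorFlow 2.x) reproduces the out-of-bounds slice in the traceback: `y_true[i, j]` indexes dimensions 0 and 1, i.e. (batch, grid row), not (grid row, grid column) as the loops assume, so the inner loop's `j = 7` overruns dimension 1 of size 7:

```python
import tensorflow as tf

y_true = tf.zeros((2, 7, 14, 5))  # (batch, 7, 14, 5)

# y_true[i, j] slices dimensions 0 and 1: (batch, grid row).
cell = y_true[1, 6]               # sample 1, row 6 -> shape (14, 5)
assert cell.shape == (14, 5)

# The inner loop runs j up to 13, but dimension 1 only has size 7,
# so j = 7 triggers the "slice index 7 of dimension 1" error:
raised = False
try:
    y_true[1, 7]
except (tf.errors.InvalidArgumentError, ValueError):
    raised = True
assert raised
```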

Eshaka
  • I referred to this to get the idea: https://stackoverflow.com/questions/41707621/keras-mean-squared-error-loss-layer – Eshaka Dec 19 '19 at 03:06
  • One other thing: I don't see anyone use loops in the loss function. Why is that? Does it have something to do with speed? How can I implement this function in an efficient way without using loops? – Eshaka Dec 19 '19 at 03:39
  • can you verbally explain what you need to do in your loss function? It's not really clear – thushv89 Dec 20 '19 at 23:50

1 Answer


To answer some of your concerns,

I don't see anyone use loops in the loss function

Usually it's bad practice. Deep nets typically train on millions of samples, so using loops instead of vectorized operations will really bring down your model's performance.
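As a toy illustration (my own example, separate from the model above), the same mean squared error computed with Python loops versus a single vectorized operation gives identical results, but the vectorized form runs as one fused array op:

```python
import numpy as np

a = np.random.rand(7, 14, 5)
b = np.random.rand(7, 14, 5)

# Loop version: one scalar operation per element
loop_mse = 0.0
for i in range(7):
    for j in range(14):
        for l in range(5):
            loop_mse += (a[i, j, l] - b[i, j, l]) ** 2
loop_mse /= a.size

# Vectorized version: a single array expression
vec_mse = np.mean((a - b) ** 2)

assert np.isclose(loop_mse, vec_mse)
```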

Implementing without loops.

I'm not sure if I've exactly captured what you wanted in your loss function. But I'm quite sure it's very close (if not this is what you needed). I could have compared your loss against mine with fixed random seeds to see if I get exactly the result given by your loss function. However, since your loss is not working, I can't do that.

def custom_loss_v2(y_true, y_pred):
  # We create MSE loss that captures what's in the else condition -> shape [batch_size, height, width]
  mse = tf.reduce_mean((y_true-y_pred)**2, axis=-1)

  # We create pred_first_ch tensor that captures what's in the if condition -> shape [batch, height, width]
  pred_first_ch = tf.gather(tf.transpose(y_pred**2, [3,0,1,2]),0)

  # We create this to get a boolean array that satisfy the conditions in the if else statement
  true_first_zero_mask = tf.equal(tf.gather(tf.transpose(y_true, [3,0,1,2]),0), 0)

  # Then we use tf.where with reduce_mean to get the final loss
  res = tf.where(true_first_zero_mask, pred_first_ch, mse)
  return tf.reduce_mean(res)
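One way to sanity-check the vectorized loss against the question's loop logic (a sketch assuming eager TensorFlow 2.x; `loop_reference` is my own NumPy port of the original loops, extended to average over the batch):

```python
import numpy as np
import tensorflow as tf

def custom_loss_v2(y_true, y_pred):
    # Same vectorized loss as above
    mse = tf.reduce_mean((y_true - y_pred) ** 2, axis=-1)
    pred_first_ch = tf.gather(tf.transpose(y_pred ** 2, [3, 0, 1, 2]), 0)
    true_first_zero_mask = tf.equal(tf.gather(tf.transpose(y_true, [3, 0, 1, 2]), 0), 0)
    res = tf.where(true_first_zero_mask, pred_first_ch, mse)
    return tf.reduce_mean(res)

def loop_reference(y_true, y_pred):
    # NumPy port of the loop from the question, averaged over the batch
    total = 0.0
    for b in range(y_true.shape[0]):
        for i in range(7):
            for j in range(14):
                gt, gp = y_true[b, i, j], y_pred[b, i, j]
                if gt[0] == 0:
                    tl = (gp[0] - gt[0]) ** 2
                else:
                    tl = np.mean((gp - gt) ** 2)
                total += tl / 98.0
    return total / y_true.shape[0]

np.random.seed(0)
yt = (np.random.rand(2, 7, 14, 5) > 0.5).astype("float32")
yp = np.random.rand(2, 7, 14, 5).astype("float32")

tf_loss = float(custom_loss_v2(tf.constant(yt), tf.constant(yp)))
np_loss = loop_reference(yt, yp)
assert abs(tf_loss - np_loss) < 1e-4
```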
thushv89
  • thanks for this answer, I am going to need to study every single line of your answer to see if this is what I want. Most importantly I want to learn how I can do this, so thanks. I will get back to you after studying this – Eshaka Dec 21 '19 at 08:39
  • can you take a look at the second part (creating the first-channel tensor) in your code? It seems like you are only accessing the data in y_pred, but mind I use both? I was trying to get the sum of the squared errors of all five variables: if the first value is 0 then the rest is of no use, and if the first value is 1 then consider all five variables. – Eshaka Dec 21 '19 at 09:00
  • I was trying to implement a basic YOLO algorithm according to this video; you can take a look, it explains the loss function that I am trying to implement. https://www.youtube.com/watch?v=GSwYGkTfOKk&list=PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs&index=2&t=0s – Eshaka Dec 21 '19 at 09:00
  • @Eshaka yes you can have both, but I got rid of `y_true` because `y_true` (in your case `gt[0]`) is zero, and (y_pred - 0) is y_pred. – thushv89 Dec 21 '19 at 09:15
  • I think you have a rough idea about this model now. What about the metrics I should use for this? 'accuracy' is not suitable, right? – Eshaka Dec 23 '19 at 05:38
  • why did you use reduce_mean? – Eshaka Dec 23 '19 at 06:41
  • @Eshaka, because it's a loss, and usually you average the loss across the batch to compute gradients. But this depends on what you exactly want too. – thushv89 Dec 23 '19 at 11:16
  • thanks, can you refer me to some tutorial that helps with understanding how to convert loops to vectorized operations? – Eshaka Dec 24 '19 at 09:26