I am implementing semantic segmentation in Keras. My test images have shape (10, 512, 512, 5), where 10 is the number of images, 512 is their height and width, and 5 is the number of classes I want to segment. I use softmax as the final activation, and as the loss I want to use the dice loss (https://arxiv.org/abs/1606.04797) to improve the segmentation results. My code is:

eps = 1e-3

def dice(y_true, y_pred):
    y_pred = K.one_hot(K.argmax(y_pred,axis=-1), Nclasses) 
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    num = 2*K.sum(y_true_f*y_pred_f)
    den = K.sum(K.square(y_true_f))+K.sum(K.square(y_pred_f))+eps
    return num/den

def dice_loss(y_true, y_pred):
    return 1-dice(y_true, y_pred)

I use K.one_hot(K.argmax(...)) so that y_pred is binary rather than made of probabilities (right?). However, when training starts, I get this error:

"ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval."
xrisk
  • What are the dimensions of `y_true` and `y_pred`? The paper you have cited computes dice loss over volumes. – Vlad May 02 '19 at 17:57

2 Answers

This post seems to indicate that, because argmax has no gradient in Keras, you cannot use it in a custom loss function.
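To illustrate why argmax leaves the loss without a usable gradient, here is a minimal pure-Python sketch (no Keras involved). The helper names `one_hot_argmax`, `soft_dice`, and `hard_dice` and the toy values are assumptions made up for this example. It nudges one predicted probability slightly and compares how the dice score reacts with and without the argmax step.

```python
# Toy setup: two pixels, three classes. All values are illustrative.
EPS = 1e-3

def one_hot_argmax(probs):
    """Replace each pixel's probability vector by a one-hot vector at its argmax."""
    out = []
    for p in probs:
        k = max(range(len(p)), key=lambda i: p[i])
        out.append([1.0 if i == k else 0.0 for i in range(len(p))])
    return out

def soft_dice(y_true, y_pred):
    """Dice formula from the question, applied to y_pred as given."""
    t = [v for px in y_true for v in px]
    p = [v for px in y_pred for v in px]
    num = 2 * sum(a * b for a, b in zip(t, p))
    den = sum(a * a for a in t) + sum(b * b for b in p) + EPS
    return num / den

def hard_dice(y_true, y_pred):
    """Same formula, but after argmax + one-hot, as in the question."""
    return soft_dice(y_true, one_hot_argmax(y_pred))

y_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
y_pred = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]

# Move a tiny amount of probability toward the correct class of pixel 0.
h = 1e-4
y_nudged = [[0.6 + h, 0.3 - h, 0.1], [0.2, 0.5, 0.3]]

hard_delta = hard_dice(y_true, y_nudged) - hard_dice(y_true, y_pred)
soft_delta = soft_dice(y_true, y_nudged) - soft_dice(y_true, y_pred)

print(hard_delta)  # 0.0: the argmax did not change, so the score cannot react
print(soft_delta)  # small positive number: the soft score does react
```

The hard score is piecewise constant in the probabilities, so small changes give no signal to follow. Computing the dice score directly on the softmax output keeps the loss differentiable.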

Abhineet Gupta

Try this code snippet for your dice coefficient. Important observation: if your masks are one-hot encoded, this code should also work for multi-class segmentation.

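from keras import backend as K

# Smoothing term: avoids division by zero when both masks are empty.
smooth = 1.

def dice_coef(y_true, y_pred):
    # Flatten all pixels and classes into one vector each.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    # y_pred stays as probabilities, so every op here is differentiable.
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)


def dice_coef_loss(y_true, y_pred):
    # Minimizing the negative coefficient maximizes the overlap.
    return -dice_coef(y_true, y_pred)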
Timbus Calin
  • Thanks for your reply. I had similar code before, but I was wondering whether I should do something to y_pred, because I think it is made of probabilities and not 0s and 1s. Anyway, I use `to_categorical` for my ground truth, so each one has shape (512, 512, 5), where 5 is the number of classes, and all of those images are binary. – Marco Domenico Cirillo May 03 '19 at 15:36
  • Yes, exactly. Since all the images are binary and you have a 3D-Block with width x height x number_of_classes, where each slice of a block width x height x 1 corresponds to a class, you have only 0's and 1's. So that is why the code should work :D, exactly like you said. – Timbus Calin May 15 '19 at 07:13
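The point discussed in these comments (one-hot ground truth, probabilities as predictions) can be checked on a toy example. This is a pure-Python sketch that mirrors the `dice_coef` formula above on nested lists. The helper `to_one_hot` is a hypothetical stand-in for what `to_categorical` produces, and the labels and probabilities are made up.

```python
SMOOTH = 1.0

def to_one_hot(labels, n_classes):
    """Turn a flat list of integer class labels into one-hot rows."""
    return [[1.0 if c == k else 0.0 for k in range(n_classes)] for c in labels]

def dice_coef(y_true, y_pred):
    """Same formula as dice_coef above, written for nested Python lists."""
    t = [v for row in y_true for v in row]
    p = [v for row in y_pred for v in row]
    intersection = sum(a * b for a, b in zip(t, p))
    return (2.0 * intersection + SMOOTH) / (sum(t) + sum(p) + SMOOTH)

# Four "pixels", three classes.
labels = [0, 2, 1, 1]
y_true = to_one_hot(labels, 3)

# A prediction identical to the ground truth scores exactly 1.
perfect = dice_coef(y_true, y_true)

# Softmax-like probabilities (each row sums to 1) score strictly between 0 and 1.
y_prob = [
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
]
soft = dice_coef(y_true, y_prob)

print(perfect)
print(soft)
```

Because the ground truth is binary, the product `y_true * y_pred` simply picks out the probability assigned to the correct class at each pixel, so no thresholding of `y_pred` is needed.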