I use the following simple implementation of dice loss:
from keras import backend as K

def dice_coef(y_true, y_pred):
    # flatten everything (batch included) and compute one global dice score
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + 1.) / (K.sum(y_true_f) + K.sum(y_pred_f) + 1.)
def MyModelLoss(y_true, y_pred):
    # shape: samples, width, height, 2 classes (0..1)
    return 1 - dice_coef(y_true, y_pred)
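For completeness, this is roughly how I plug it in (model here is just a placeholder for any segmentation network with a matching output; optimizer choice is arbitrary):

# sketch: wiring the custom loss and metric into training
model.compile(optimizer='adam', loss=MyModelLoss, metrics=[dice_coef])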
I want to try the following variations:
mean per batch (the current implementation doesn't take the internal structure of the results into account... I'm not sure, but maybe a mean dice coefficient would be more relevant)
calculate the dice coefficient for 32*32 blocks and take the minimum... Again, I'm not sure, but maybe it would be more relevant (a rough sketch of this follows the list).
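A rough sketch of what I mean by the block variant (the block size of 32, pooling the classes together inside each block, and the assumption of static spatial dimensions divisible by the block size are all my own choices, not tested):

def block_dice_coef(y_true, y_pred, block=32, smooth=1.):
    # assumes static shape (batch, H, W, classes), H and W divisible by block
    _, h, w, c = K.int_shape(y_pred)

    def to_blocks(t):
        # split each image into non-overlapping block x block tiles
        t = K.reshape(t, (-1, h // block, block, w // block, block, c))
        t = K.permute_dimensions(t, (0, 1, 3, 2, 4, 5))
        return K.reshape(t, (-1, (h // block) * (w // block), block * block * c))

    t_true, t_pred = to_blocks(y_true), to_blocks(y_pred)
    intersection = K.sum(t_true * t_pred, axis=-1)
    sums = K.sum(t_true, axis=-1) + K.sum(t_pred, axis=-1)
    dice = (2. * intersection + smooth) / (sums + smooth)  # dice per block
    # worst block per sample, then mean over the batch
    return K.mean(K.min(dice, axis=-1))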
I tried to implement the mean loss by a simple dice_coef(y_true, y_pred) / y_true.shape[0] (which isn't correct, I know), but even this causes the exception ValueError: None values not supported. I understand that Keras inspects the loss function by calling it on symbolic tensors whose batch size is not yet known, but what would be the correct implementation then?
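As far as I can tell, the None comes from the batch dimension being symbolic at graph-construction time, so y_true.shape[0] is literally None; reducing per sample and letting K.mean average over the batch avoids touching the batch size at all. A minimal sketch of that "mean per batch" idea:

def dice_coef_mean(y_true, y_pred, smooth=1.):
    # per-sample dice: flatten everything except the batch axis,
    # then average the per-sample scores over the (unknown) batch size
    y_true_f = K.batch_flatten(y_true)
    y_pred_f = K.batch_flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f, axis=-1)
    sums = K.sum(y_true_f, axis=-1) + K.sum(y_pred_f, axis=-1)
    return K.mean((2. * intersection + smooth) / (sums + smooth))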
Partially solved it in the following way:
def dice_coef(y_true, y_pred, smooth=1):
    # considering shape (batch, image_size, image_size, classes)
    axis = [1, 2]
    intersection = K.sum(y_true * y_pred, axis=axis)
    sums = K.sum(y_true, axis=axis) + K.sum(y_pred, axis=axis)
    # standard dice, computed per sample and per class
    dice = (2. * intersection + smooth) / (sums + smooth)
    # min dice over the classes
    minClassDice = K.min(dice, axis=-1)
    # mean over all samples in the batch
    return K.mean(minClassDice, axis=-1)
I'm not sure that it works correctly (I could have confused the axes), but the accuracy of the model improved greatly. There is a drawback, though: the network can get "stuck", because the min may hide noticeable changes.
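One idea I may try to soften that drawback (untested, and the 0.5 weight is arbitrary) is to blend the min with the mean over the classes, so the loss does not depend on the worst class alone:

def dice_coef_blend(y_true, y_pred, smooth=1, min_weight=0.5):
    # same per-class dice as above, but mix the worst class with the class mean
    axis = [1, 2]
    intersection = K.sum(y_true * y_pred, axis=axis)
    sums = K.sum(y_true, axis=axis) + K.sum(y_pred, axis=axis)
    dice = (2. * intersection + smooth) / (sums + smooth)
    blended = min_weight * K.min(dice, axis=-1) + (1. - min_weight) * K.mean(dice, axis=-1)
    # mean over all samples in the batch
    return K.mean(blended)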