I'm trying to implement the Multiclass Hybrid loss function in Python, following this article https://arxiv.org/pdf/1808.05238.pdf, for my semantic segmentation problem with an imbalanced dataset. I managed to get my implementation correct enough to start training the model, but the results are very poor. Model architecture: U-Net; the learning rate of the Adam optimizer is 1e-5. The mask shape is (None, 512, 512, 3), with 3 classes (in my case: forest, deforestation, other). The formula I used to implement my loss:

L_hybrid = L_tversky + λ · L_focal

L_tversky = C − Σ_{c=1}^{C} TP(c) / (TP(c) + α · FP(c) + β · FN(c))

L_focal = −(1/N) · Σ_{c=1}^{C} Σ_{n=1}^{N} g_n(c) · (1 − p_n(c))^2 · log(p_n(c))

where g_n(c) is the ground-truth label and p_n(c) the predicted probability of class c for pixel n.

The code I created:

def build_hybrid_loss(_lambda_=1, _alpha_=0.5, _beta_=0.5, smooth=1e-6):
    def hybrid_loss(y_true, y_pred):
        C = 3
        tversky = 0
        # Calculate Tversky Loss
        for index in range(C):
            inputs_fl = tf.nest.flatten(y_pred[..., index])
            targets_fl = tf.nest.flatten(y_true[..., index])
        
            #True Positives, False Positives & False Negatives
            TP = tf.reduce_sum(tf.math.multiply(inputs_fl, targets_fl))
            FP = tf.reduce_sum(tf.math.multiply(inputs_fl, 1-targets_fl[0]))
            FN = tf.reduce_sum(tf.math.multiply(1-inputs_fl[0], targets_fl))
           
            tversky_i = (TP + smooth) / (TP + _alpha_ * FP + _beta_ * FN + smooth)  
            tversky += tversky_i
        tversky += C
        
        # Calculate Focal loss
        loss_focal = 0
        for index in range(C):
            f_loss = - (y_true[..., index] * (1 - y_pred[..., index])**2 * tf.math.log(y_pred[..., index]))
            # Average over each data point/image in batch
            axis_to_reduce = range(1, 3)
            f_loss = tf.math.reduce_mean(f_loss, axis=axis_to_reduce)
            loss_focal += f_loss
            
        result = tversky + _lambda_ * loss_focal
        return result
    return hybrid_loss

The prediction of the model after the end of an epoch (I have a problem with swapped colors, so the red in the prediction is actually green, which means forest, i.e. the prediction is mostly forest and not deforestation): [prediction image]

The question is what is wrong with my hybrid loss implementation, what needs to be changed to make it work?

  • `I have a problem with swapped colors`: if you are using OpenCV, note that its default color space is `BGR`, while other image manipulation libraries tend to work with `RGB`. – Lescurel Feb 03 '21 at 08:44
  • @Lescurel, when I started the project I made my Class 1 = Forest (color green), and `prediction[..., 0]` is the predicted mask for the forest. But when I use `plt.imshow(prediction)`, the prediction has three channels just like an RGB image, so the first mask (the forest mask) is rendered red. This is now a side issue. – Петр Воротинцев Feb 03 '21 at 09:06
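
As a side note to that comment thread, here is a minimal sketch of the display fix: `plt.imshow` interprets the three channels as R, G, B in that order, so reordering the class channels before plotting renders forest (channel 0) in green. The channel order below is an assumption based on the class list in the question:

import numpy as np
import matplotlib.pyplot as plt

# prediction: (512, 512, 3) per-class probabilities;
# assumed order: channel 0 = forest, 1 = deforestation, 2 = other
prediction = np.random.rand(512, 512, 3)  # stand-in for a real model output

# plt.imshow maps channel 0 -> red, 1 -> green, 2 -> blue,
# so move the forest mask into the green channel before plotting
plt.imshow(prediction[..., [1, 0, 2]])
plt.show()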

1 Answer


To simplify things a little, I have divided the Hybrid loss into four separate functions: Tversky loss, Dice coefficient, multiclass Dice loss, and Hybrid loss; you can see the code below. The main problem in your implementation is the sign of the Tversky term: `tversky += C` adds the summed per-class Tversky indices to C, but the loss should be C minus that sum. A higher Tversky index means better overlap, so as written, minimizing your loss actually pushes the overlap down. Note also that `tf.nest.flatten` does not flatten a tensor (it only flattens Python structures such as lists and dicts), so the per-class slices are flattened with `tf.reshape(..., [-1])` below, which also removes the need for the inconsistent `[0]` indexing in the FP and FN terms. In this version I use a multiclass Dice loss as the second term in place of the focal loss.

def TverskyLoss(targets, inputs, alpha=0.5, beta=0.5, smooth=1e-16, numLabels=3):
    tversky = 0
    for index in range(numLabels):
        # Flatten the per-class slices to 1-D tensors
        inputs_fl = tf.reshape(inputs[..., index], [-1])
        targets_fl = tf.reshape(targets[..., index], [-1])

        # True Positives, False Positives & False Negatives
        TP = tf.reduce_sum(inputs_fl * targets_fl)
        FP = tf.reduce_sum(inputs_fl * (1 - targets_fl))
        FN = tf.reduce_sum((1 - inputs_fl) * targets_fl)

        tversky_i = (TP + smooth) / (TP + alpha * FP + beta * FN + smooth)
        tversky += tversky_i
    # Perfect overlap gives a Tversky index of 1 per class, so the loss is 0
    return numLabels - tversky

def dice_coef(y_true, y_pred, smooth=1e-16):
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.math.reduce_sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (tf.math.reduce_sum(y_true_f) + tf.math.reduce_sum(y_pred_f) + smooth)

def dice_coef_multilabel(y_true, y_pred, numLabels=3):
    dice = 0
    for index in range(numLabels):
        dice -= dice_coef(y_true[..., index], y_pred[..., index])
    # Equals numLabels minus the sum of per-class Dice coefficients
    return numLabels + dice

def build_hybrid_loss(_lambda_=0.5, _alpha_=0.5, _beta_=0.5, smooth=1e-16, C=3):
    def hybrid_loss(y_true, y_pred):
        tversky = TverskyLoss(y_true, y_pred, alpha=_alpha_, beta=_beta_, smooth=smooth, numLabels=C)
        dice = dice_coef_multilabel(y_true, y_pred, numLabels=C)
        return tversky + _lambda_ * dice
    return hybrid_loss
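
A quick sanity check on dummy tensors (a sketch only; the (batch, 512, 512, 3) one-hot layout is taken from the question):

import tensorflow as tf

# Dummy batch: 2 one-hot masks and normalized random predictions
y_true = tf.one_hot(tf.random.uniform((2, 512, 512), 0, 3, dtype=tf.int32), depth=3)
y_pred = tf.random.uniform((2, 512, 512, 3))
y_pred = y_pred / tf.reduce_sum(y_pred, axis=-1, keepdims=True)

loss_fn = build_hybrid_loss()
print(loss_fn(y_true, y_pred))  # a scalar; decreases as y_pred approaches y_true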

Passing loss=build_hybrid_loss() to model.compile() sets the Hybrid loss as the loss function of the model.
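
For example (a sketch, assuming a Keras U-Net named model and the Adam learning rate of 1e-5 from the question):

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss=build_hybrid_loss(_lambda_=0.2, _alpha_=0.5, _beta_=0.5),
              metrics=['accuracy'])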

After a short investigation, I came to the conclusion that, in my particular case, a Hybrid loss with _lambda_ = 0.2, _alpha_ = 0.5, _beta_ = 0.5 is not much better than a single Dice loss or a single Tversky loss: neither IoU (intersection over union) nor the standard accuracy metric improve much with the Hybrid loss. But I don't believe it is a rule of thumb that such a Hybrid loss will always be worse than, or merely on par with, a single loss.

link to Accuracy graph

link to IoU graph