I have written a custom implementation of the PyTorch cross-entropy loss function (I need the extra flexibility for changes I plan to introduce later). The model I intend to train with it will take a considerable amount of time, and the available resources cannot be spent merely on verifying that the function is implemented correctly. I have written it in a vectorized form so that it runs faster.
Here is my code:
import torch

def custom_cross(my_pred, true, batch_size=BATCH_SIZE):
    # Per-sample cross-entropy: -sum(true * log(pred)) over classes, averaged over the batch
    loss = -torch.mean(torch.sum(true.view(batch_size, -1) * torch.log(my_pred.view(batch_size, -1)), dim=1))
    return loss
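For clarity, this is how I expect the function to be used; a minimal sketch where the batch size, the number of classes, and the assumption that my_pred holds softmax probabilities and true is a one-hot encoding are only illustrative. Under those assumptions the result should match torch.nn.functional.cross_entropy:

import torch
import torch.nn.functional as F

BATCH_SIZE, NUM_CLASSES = 4, 10  # illustrative sizes only

logits = torch.randn(BATCH_SIZE, NUM_CLASSES)
labels = torch.randint(0, NUM_CLASSES, (BATCH_SIZE,))

probs = F.softmax(logits, dim=1)                  # my_pred: softmax probabilities
one_hot = F.one_hot(labels, NUM_CLASSES).float()  # true: one-hot targets

print(custom_cross(probs, one_hot, batch_size=BATCH_SIZE))  # custom implementation
print(F.cross_entropy(logits, labels))                      # PyTorch reference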
I would really appreciate it if you could suggest a more optimized implementation, or point out any mistake in the present one. The model will be trained on an NVIDIA Tesla K80.