For a machine learning course, I have to implement the forward and backward methods of a cross-entropy loss (CELoss):
    class CELoss(object):
        @staticmethod
        def forward(x, y):
            assert len(x.shape) == 2  # x is batch of predictions (batch_size, 10)
            assert len(y.shape) == 1  # y is batch of target labels (batch_size,)
            # TODO implement cross entropy loss averaged over batch
            return

        @staticmethod
        def backward(x, y, dout):
            # TODO implement dx
            dy = 0.0  # no useful gradient for y, just set it to zero
            return dx, dy
Moreover, I am given the CELoss as:
\mathrm{CELoss}(x, y) = -\log \frac{\exp(x_y)}{\sum_{k} \exp(x_k)}
(The site says I cannot use the formula editor because I need at least 10 reputation.)
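For reference, here is a minimal NumPy sketch of one way to read the formula. It assumes, as the asserts in the template suggest, that x holds unnormalized class scores of shape (batch_size, 10) and y holds integer class indices, so that x_y would mean the score in column y[i] of row i. The function names ce_forward and ce_backward are hypothetical, and this is not necessarily the solution the course intends.

```python
import numpy as np


def ce_forward(x, y):
    """Cross-entropy of softmax(x) against integer labels y, averaged over the batch.

    Assumption: x has shape (batch_size, num_classes) and holds raw scores,
    y has shape (batch_size,) and holds class indices.
    """
    # Subtract the row-wise maximum for numerical stability; the loss is unchanged.
    shifted = x - x.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # "x_y" reading: for each row i, pick the entry in column y[i].
    picked = log_probs[np.arange(x.shape[0]), y]
    return -picked.mean()


def ce_backward(x, y, dout=1.0):
    """Gradient of the batch-averaged loss with respect to x."""
    n = x.shape[0]
    shifted = x - x.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # d(loss)/dx = (softmax(x) - one_hot(y)) / batch_size, scaled by the upstream gradient.
    dx = probs.copy()
    dx[np.arange(n), y] -= 1.0
    return dx * dout / n


# Usage: with all-zero scores over 3 classes, every class has probability 1/3,
# so the loss equals log(3) regardless of the labels.
x = np.zeros((2, 3))
y = np.array([0, 2])
loss = ce_forward(x, y)
grad = ce_backward(x, y)
```

Under this reading the formula matches the Wikipedia definition when the target distribution is one-hot: only the term for the true class y survives in the sum, which leaves the negative log of the softmax probability of that class.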
This, however, is not the CELoss that you can find on Wikipedia, for example (https://en.wikipedia.org/wiki/Cross_entropy). From my understanding, the CELoss takes targets and predictions. Does x represent the targets here and y the predictions? If so, what does x_y refer to? Thank you for your help!