I have predicted and target tensors, each of shape (32 x 8 x 5000). Here, the batch size is 32, the number of points per batch element is 8, and the number of classes is 5000. I want to calculate CELoss so that the loss is computed for every point (across the 5000 classes) and then averaged across the 8 points. How can I do this?
For clarity, a batch contains 32 batch points (bs=32). Each batch point has 8 vector points, and each vector point has scores over 5000 classes. For a given batch point, I wish to compute CELoss for each of its 8 vector points and average those 8 values, and I want to do this for all 32 batch points.
Let me know if my question is unclear or ambiguous.
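One way the reduction described above might look, as a minimal sketch rather than a definitive answer. It assumes the predictions are raw scores (logits) and, as a simplification, that each vector point has a single integer class index as its target; `logits`, `target`, `per_point` and `per_batch_point` are hypothetical names used only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, num_points, num_classes = 32, 8, 5000

# Hypothetical stand-ins for the real tensors: raw scores (logits) and,
# as an assumption, one integer class index per vector point.
logits = torch.randn(batch_size, num_points, num_classes)
target = torch.randint(0, num_classes, (batch_size, num_points))

# F.cross_entropy expects the class dimension at position 1, so move it:
# (32, 8, 5000) -> (32, 5000, 8). reduction="none" keeps one loss per point.
per_point = F.cross_entropy(logits.permute(0, 2, 1), target, reduction="none")  # (32, 8)

# Average across the 8 vector points: one loss per batch point.
per_batch_point = per_point.mean(dim=1)  # (32,)

# Optional scalar for backpropagation: mean over the 32 batch points.
loss = per_batch_point.mean()
```

Because every batch point has the same number of vector points, the final scalar equals the plain mean over all 32 x 8 points; the intermediate `per_batch_point` is only needed if the per-batch-point values are wanted separately.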
For example:
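import torch

# Predictions: 4 batch points x 3 vector points x 5 classes
op = torch.rand((4, 3, 5))
# Targets, same shape as the predictions
gt = torch.tensor([
    [[0,1,1,0,0],[0,0,1,0,0],[1,1,0,0,1]],
    [[1,1,0,0,1],[0,0,0,1,0],[0,0,1,0,0]],
    [[0,0,1,0,0],[1,1,1,1,0],[1,1,0,0,1]],
    [[1,1,0,0,1],[1,1,0,0,1],[1,0,0,0,0]]
])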
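Applied to the example tensors, a hedged sketch could look like the following. Note that several rows of `gt` contain more than one 1, so they are not one-hot. The sketch assumes `op` holds raw scores (logits) and that every 1 in `gt` marks a target class whose negative log-probability is summed; that interpretation is an assumption, not something stated in the question. If the labels are really independent yes/no flags per class, a binary cross-entropy on the logits may be the better fit. The names `log_probs`, `per_point` and `per_batch_point` are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

op = torch.rand((4, 3, 5))  # treated as logits here (an assumption)
gt = torch.tensor([
    [[0, 1, 1, 0, 0], [0, 0, 1, 0, 0], [1, 1, 0, 0, 1]],
    [[1, 1, 0, 0, 1], [0, 0, 0, 1, 0], [0, 0, 1, 0, 0]],
    [[0, 0, 1, 0, 0], [1, 1, 1, 1, 0], [1, 1, 0, 0, 1]],
    [[1, 1, 0, 0, 1], [1, 1, 0, 0, 1], [1, 0, 0, 0, 0]],
], dtype=torch.float32)

# Log-probabilities over the class dimension (last axis).
log_probs = F.log_softmax(op, dim=-1)          # (4, 3, 5)

# Cross-entropy per vector point: sum over classes of -target * log_prob.
per_point = -(gt * log_probs).sum(dim=-1)      # (4, 3)

# Average across the 3 vector points: one loss per batch point.
per_batch_point = per_point.mean(dim=1)        # (4,)

# Optional scalar: mean over the 4 batch points.
loss = per_batch_point.mean()
```

For rows that happen to be one-hot, such as `gt[0, 1]`, `per_point` matches what `F.cross_entropy` would return for that class index.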