In machine learning and information theory, cross-entropy measures the dissimilarity between two probability distributions over the same underlying set of events. It is not a true distance: it is asymmetric and does not vanish when the two distributions coincide. Cross-entropy is a standard choice of loss function for classification tasks in neural networks.
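For discrete distributions p (the true labels) and q (the model's predictions), the quantity in question is H(p, q) = -Σ_x p(x) log q(x); a minimal NumPy sketch (values are illustrative):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log q(x) for discrete distributions p and q."""
    q = np.clip(q, eps, 1.0)          # guard against log(0)
    return -np.sum(p * np.log(q))

p = np.array([1.0, 0.0, 0.0])         # true (one-hot) distribution
q = np.array([0.7, 0.2, 0.1])         # predicted distribution
print(cross_entropy(p, q))            # = -log(0.7) ≈ 0.357
```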
Questions tagged [cross-entropy]
360 questions
3 votes · 0 answers
Why is MSE loss working better for a multi-class classification problem than categorical crossentropy?
I have built a model with multiple heads, some doing regression and some classification. I then sum up all the losses in a weighted manner to backpropagate.
For the classification heads, I use the one-hot encoding approach, and the code takes the argmax on…

Pratik Kayal
- 31
- 1
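A minimal sketch of how the two losses compare on the same one-hot target, assuming a PyTorch setup with illustrative logits (the question does not state the framework):

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for a 4-class head and a one-hot target (illustrative values).
logits = torch.tensor([[2.0, 0.5, -1.0, 0.1]])
one_hot = torch.tensor([[1.0, 0.0, 0.0, 0.0]])

# Categorical cross-entropy: compare log-softmax of the logits to the one-hot target.
ce = -(one_hot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# MSE on the softmax probabilities, as the question seems to do before taking argmax.
mse = F.mse_loss(F.softmax(logits, dim=1), one_hot)

print(ce.item(), mse.item())
```

Since MSE on probabilities is typically an order of magnitude smaller than cross-entropy, the relative weighting of the summed head losses, and not just the choice of loss, can explain why one appears to "work better".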
3 votes · 1 answer
pos_weight in binary cross entropy calculation
When we deal with imbalanced training data (there are more negative samples and fewer positive samples), the pos_weight parameter is usually used.
The expectation of pos_weight is that the model will get a higher loss when the positive sample gets the…

Wu Shiauthie
- 69
- 1
- 9
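A minimal sketch of how pos_weight is typically set and used with PyTorch's BCEWithLogitsLoss; the negative/positive ratio is one common heuristic, and the counts below are illustrative:

```python
import torch
import torch.nn as nn

# Suppose the training set has 900 negatives and 100 positives (illustrative counts).
num_neg, num_pos = 900, 100

# A common heuristic: weight positive examples by the negative/positive ratio.
pos_weight = torch.tensor([num_neg / num_pos])   # = 9.0

loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.tensor([0.2, -1.5, 0.7])    # raw model outputs (no sigmoid)
targets = torch.tensor([1.0, 0.0, 1.0])

# Each positive target's -log(sigmoid(logit)) term is multiplied by pos_weight.
print(loss_fn(logits, targets))
```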
3 votes · 3 answers
How to calculate the correct cross entropy between 2 tensors in PyTorch when the target is not one-hot?
I am confused about the calculation of cross entropy in PyTorch. If I want to calculate the cross entropy between 2 tensors and the target tensor is not a one-hot label, which loss should I use? It is quite common to calculate the cross entropy…

Zhongzheng_11
- 181
- 1
- 10
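A minimal sketch for soft (non-one-hot) targets in PyTorch, with illustrative logits: either compute the cross-entropy manually from log_softmax, or, in PyTorch 1.10+, pass the probability targets to cross_entropy directly:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.5, 0.3, -0.8]])        # raw scores for 3 classes
soft_target = torch.tensor([[0.7, 0.2, 0.1]])    # a probability distribution, not one-hot

# Manual soft-label cross-entropy: -sum(p * log_softmax(logits)), averaged over the batch.
loss_manual = -(soft_target * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# Recent PyTorch versions (1.10+) also accept probability targets directly.
loss_builtin = F.cross_entropy(logits, soft_target)

print(loss_manual.item(), loss_builtin.item())   # the two should agree
```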
3 votes · 1 answer
PyTorch cross-entropy loss with 3D input
I have a network which outputs a 3D tensor of size (batch_size, max_len, num_classes). My ground truth is in the shape (batch_size, max_len). If I perform one-hot encoding on the labels, it'll be of shape (batch_size, max_len, num_classes), i.e. the…

Hari Krishnan
- 2,049
- 2
- 18
- 29
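A minimal sketch of the usual fix, assuming integer class labels: move the class dimension to position 1 (or flatten the sequence dimension into the batch), since CrossEntropyLoss expects input of shape (N, C, ...):

```python
import torch
import torch.nn as nn

batch_size, max_len, num_classes = 4, 7, 10                       # illustrative sizes
outputs = torch.randn(batch_size, max_len, num_classes)           # model output
targets = torch.randint(0, num_classes, (batch_size, max_len))    # integer class ids, no one-hot

loss_fn = nn.CrossEntropyLoss()

# CrossEntropyLoss wants the class dimension in position 1: (N, C, ...).
loss = loss_fn(outputs.permute(0, 2, 1), targets)

# Equivalent alternative: flatten the sequence dimension into the batch dimension.
loss_flat = loss_fn(outputs.reshape(-1, num_classes), targets.reshape(-1))

print(loss.item(), loss_flat.item())   # identical with mean reduction
```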
3 votes · 1 answer
Non-binary ground truth labels in binary crossentropy?
Does it make sense to use non-binary ground truth values for binary crossentropy? Is there any formal proof?
It looks like it is used in practice: for example in https://blog.keras.io/building-autoencoders-in-keras.html, i.e. MNIST images are not binary,…

mrgloom
- 20,061
- 36
- 171
- 301
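Binary cross-entropy is well defined for any target in [0, 1]; a small NumPy sketch (illustrative values) showing that the loss is minimised when the prediction equals the soft target, although the minimum is then non-zero:

```python
import numpy as np

def bce(target, pred, eps=1e-7):
    """Binary cross-entropy for targets anywhere in [0, 1], not just {0, 1}."""
    pred = np.clip(pred, eps, 1 - eps)
    return -(target * np.log(pred) + (1 - target) * np.log(1 - pred))

t = 0.8                               # a non-binary "ground truth", e.g. a grey MNIST pixel
preds = np.linspace(0.01, 0.99, 99)
losses = bce(t, preds)

# The loss is minimised when the prediction equals the soft target...
print(preds[np.argmin(losses)])       # ~0.8
# ...but the minimum is the entropy of t, not 0.
print(bce(t, t))
```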
3 votes · 1 answer
Logic behind choosing weight for weighted loss calculation?
What is the general logic behind choosing the weight for calculating weighted sigmoid cross-entropy loss, or for any weighted loss in case of an imbalanced dataset? The problem domain is based on vision/image classification.

Solaiman Salvi
- 577
- 4
- 9
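One common heuristic (not the only one) is inverse class frequency; a sketch with illustrative class counts, passed to PyTorch's CrossEntropyLoss as the weight argument:

```python
import numpy as np
import torch
import torch.nn as nn

# Illustrative class counts for an imbalanced 3-class image dataset.
counts = np.array([5000, 800, 200])

# Inverse-frequency weights, normalised so a balanced dataset would give weight 1.0 per class.
weights = counts.sum() / (len(counts) * counts)
print(weights)   # the rarest classes get the largest weights

loss_fn = nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))
```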
3 votes · 2 answers
Derivative in both arguments of torch.nn.BCELoss()
When using a torch.nn.BCELoss() on two arguments that are both results of some earlier computation, I get some curious error, which this question is about:
RuntimeError: the derivative for 'target' is not implemented
The MCVE is as follows:
import…

flawr
- 10,814
- 3
- 41
- 71
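The error comes from BCELoss only implementing gradients for its input argument; a minimal sketch of the usual workaround, detaching whichever tensor should act as the constant target (tensors here are illustrative):

```python
import torch
import torch.nn as nn

loss_fn = nn.BCELoss()

# Both tensors come from earlier computations and therefore require grad.
pred = torch.sigmoid(torch.randn(5, requires_grad=True))
target = torch.sigmoid(torch.randn(5, requires_grad=True))

# loss_fn(pred, target)  # RuntimeError: the derivative for 'target' is not implemented

# Detach the tensor that should be treated as a constant target.
loss = loss_fn(pred, target.detach())
loss.backward()
```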
3 votes · 1 answer
TensorFlow model gets zero loss
import tensorflow as tf
import numpy as np
import os
import re
import PIL
def read_image_label_list(img_directory, folder_name):
# Input:
# -Name of folder (test\\\\train)
# Output:
# -List of names of files in folder
# …

Tharuka Devendra
- 33
- 3
3 votes · 2 answers
What's the difference between softmax_cross_entropy_with_logits and losses.log_loss?
What's the primary difference between tf.nn.softmax_cross_entropy_with_logits and tf.losses.log_loss? Both methods accept one-hot labels and logits to calculate cross-entropy loss for classification tasks.

mynameisvinn
- 341
- 4
- 10
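A framework-neutral NumPy sketch of my understanding of the two ops: softmax_cross_entropy_with_logits applies a softmax to raw logits and returns one multinomial cross-entropy value per example, while losses.log_loss expects probabilities (not logits) and applies an element-wise binary log loss:

```python
import numpy as np

labels = np.array([[0.0, 1.0, 0.0]])        # one-hot label
logits = np.array([[1.0, 2.0, 0.5]])        # raw, unnormalised scores

# softmax_cross_entropy_with_logits: softmax over classes, then one
# multinomial cross-entropy value per example.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
softmax_ce = -(labels * np.log(probs)).sum(axis=1)

# losses.log_loss: expects probabilities and applies the element-wise binary
# log loss -y*log(p) - (1-y)*log(1-p), then averages.
log_loss = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs)).mean()

print(softmax_ce, log_loss)
```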
3 votes · 1 answer
Error when checking target: sparse_categorical_crossentropy output shape
I am attempting to train InceptionV3 on a novel set of images using transfer learning. I am running into this issue, which clearly relates to a mismatch of input and output dimensions (I think), but I can't seem to identify the problem. All relevant…

GhostRider
- 2,109
- 7
- 35
- 53
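A minimal Keras sketch (illustrative shapes and a hypothetical tiny model, not InceptionV3) of the shape contract: sparse_categorical_crossentropy wants integer class ids of shape (batch,), while categorical_crossentropy wants one-hot targets of shape (batch, num_classes):

```python
import numpy as np
import tensorflow as tf

num_classes = 5
model = tf.keras.Sequential([
    tf.keras.layers.Dense(num_classes, activation="softmax", input_shape=(8,)),
])

x = np.random.rand(4, 8).astype("float32")

# sparse_categorical_crossentropy: integer class ids, shape (batch,)
y_sparse = np.array([0, 3, 1, 4])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, y_sparse, epochs=1, verbose=0)

# categorical_crossentropy: one-hot targets, shape (batch, num_classes)
y_onehot = tf.keras.utils.to_categorical(y_sparse, num_classes)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(x, y_onehot, epochs=1, verbose=0)
```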
3 votes · 2 answers
How does Keras deal with log(0) for categorical cross entropy?
I have a neural network, trained on MNIST, with categorical cross entropy as its loss function.
For theoretical purposes my output layer is ReLU. Therefore a lot of its outputs are 0.
Now I stumbled across the following question:
Why don't I get a…

snoozzz
- 135
- 1
- 1
- 7
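A NumPy sketch of what the Keras backend does, as far as I can tell, when from_logits=False: predictions are normalised and clipped to [epsilon, 1 - epsilon] before the log, so an exact zero never reaches log():

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Normalise, then clip away exact zeros and ones before taking the log.
    y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -(y_true * np.log(y_pred)).sum(axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.0, 1.0, 0.0]])   # ReLU output with exact zeros

print(categorical_crossentropy(y_true, y_pred))   # small and finite, not nan/inf
```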
2 votes · 2 answers
How to calculate cross entropy in keras for target values that aren't 0 or 1
Rather than training a neural network to output 1 or 0 through the output sigmoid layer, LeCun recommends (in the paper "Efficient BackProp" - LeCun et al, 1998, section 4.5):
Choose target values at the point of the maximum second derivative on…

JMS
- 1,039
- 4
- 12
- 20
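A small sketch, assuming soft targets such as 0.1/0.9 in the spirit of that recommendation, showing that Keras' binary crossentropy accepts them directly (values are illustrative):

```python
import numpy as np
import tensorflow as tf

# Soft targets instead of hard 0 / 1, fed straight into Keras' binary crossentropy.
y_true = np.array([[0.9], [0.1], [0.9]], dtype="float32")
y_pred = np.array([[0.85], [0.2], [0.7]], dtype="float32")

bce = tf.keras.losses.BinaryCrossentropy()
print(bce(y_true, y_pred).numpy())

# Note: with soft targets the loss at y_pred == y_true equals the entropy of the
# target (non-zero), so a "perfect" model no longer drives the loss to 0.
```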
2 votes · 1 answer
Why does the PyTorch CrossEntropyLoss use label encoding instead of one-hot encoding?
I'm learning about the CrossEntropyLoss module in PyTorch.
The tutor says you should input the target value y 'label encoded', not 'one-hot encoded'.
Like this
loss = nn.CrossEntropyLoss()
Y = torch.tensor([0])
Y_pred_good = torch.tensor([[2.0, 1.0,…

yubin
- 125
- 8
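A minimal sketch showing that the label-encoded target carries the same information as the one-hot version; CrossEntropyLoss simply indexes the log-softmax instead of multiplying by a one-hot vector (logits are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.CrossEntropyLoss()

logits = torch.tensor([[2.0, 1.0, 0.1]])   # raw scores for 3 classes
y_index = torch.tensor([0])                # label-encoded target: "class 0"

loss = loss_fn(logits, y_index)

# The same value via an explicit one-hot target: the index form just skips
# building (and multiplying by) the one-hot vector.
y_onehot = F.one_hot(y_index, num_classes=3).float()
loss_manual = -(y_onehot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(loss.item(), loss_manual.item())     # identical
```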
2 votes · 1 answer
Different cross entropy results from NumPy and PyTorch
My prediction is y_hat = [ 0.57,0.05,0.14,0.10,0.14] and target is
target =[ 1, 0, 0, 0, 0 ].
I need to calculate the cross-entropy loss with NumPy and with the PyTorch loss function.
Using NumPy my formula is -np.sum(target*np.log(y_hat)), and I got…

hello m
- 21
- 2
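A minimal sketch of the usual source of the discrepancy: the NumPy formula treats y_hat as probabilities, whereas PyTorch's cross-entropy loss treats its input as logits and applies log_softmax again; feeding log-probabilities to nll_loss reproduces the NumPy number:

```python
import numpy as np
import torch
import torch.nn.functional as F

y_hat = np.array([0.57, 0.05, 0.14, 0.10, 0.14])   # already probabilities
target_onehot = np.array([1, 0, 0, 0, 0])

# NumPy: cross-entropy on probabilities.
ce_numpy = -np.sum(target_onehot * np.log(y_hat))   # = -log(0.57)

# PyTorch: F.cross_entropy would apply log_softmax again and give a different
# number; to match NumPy, feed log-probabilities to nll_loss instead.
log_probs = torch.log(torch.tensor(y_hat)).unsqueeze(0)
ce_torch = F.nll_loss(log_probs, torch.tensor([0]))

print(ce_numpy, ce_torch.item())   # both ≈ 0.562
```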
2 votes · 1 answer
CrossEntropyLoss showing poor accuracy on 2d output
I'm trying some experiments on a simple neural network that just tries to learn the squares of some random numbers, represented as arrays of decimal digits, code copied below, with changes indicated by comments.
The version using nn.Softmax(dim=2)…

rwallace
- 31,405
- 40
- 123
- 242
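A minimal sketch of the two things that usually bite here (shapes and values are illustrative): CrossEntropyLoss wants raw logits, not softmax output, and for multi-dimensional output the class axis must be dimension 1:

```python
import torch
import torch.nn as nn

batch, positions, num_classes = 8, 4, 10     # e.g. 4 decimal digits, 10 classes each
logits = torch.randn(batch, positions, num_classes, requires_grad=True)
targets = torch.randint(0, num_classes, (batch, positions))

loss_fn = nn.CrossEntropyLoss()

# CrossEntropyLoss applies log-softmax itself, so it should receive raw logits,
# and for multi-dimensional output the class axis has to be dimension 1.
loss = loss_fn(logits.permute(0, 2, 1), targets)
loss.backward()

# Passing nn.Softmax(dim=2) output instead would push the logits through a
# second (implicit) softmax, which flattens the gradients and hurts accuracy.
```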