Questions tagged [cross-entropy]

In machine learning and information theory, the cross entropy is a measure of the dissimilarity between two probability distributions over the same underlying set of events; unlike a true distance it is asymmetric, and it is minimized when the two distributions coincide (at which point it equals the entropy of the true distribution). Cross entropy is the most common choice of loss function in neural networks for classification tasks.
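As a quick illustration of the definition, here is a minimal NumPy sketch (the distributions are made-up examples):

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) * log q(x).
    Asymmetric, and equal to the entropy of p when q == p, so not a true distance."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q))

p = [0.0, 1.0, 0.0]          # "true" distribution (a one-hot label)
q = [0.1, 0.7, 0.2]          # model's predicted distribution
print(cross_entropy(p, q))   # ≈ 0.357, i.e. -log(0.7)
```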

360 questions
3 votes, 0 answers

Why is MSE loss working better for multi-class classification problem than categorical crossentropy?

I have built a model with multiple heads, some doing regression and some classification. I then sum all the losses in a weighted manner for backpropagation. For the classification heads, I use the one-hot encoding approach, and the code takes the argmax on…
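For readers comparing the two losses, a small sketch (PyTorch, hypothetical shapes) of the setup the question seems to describe: cross entropy fed raw logits plus integer labels versus MSE on a softmaxed output against one-hot targets:

```python
import torch

logits = torch.randn(8, 5)                             # 8 samples, 5 classes
labels = torch.randint(0, 5, (8,))                     # integer class ids
onehot = torch.nn.functional.one_hot(labels, 5).float()

# Categorical cross entropy: expects raw logits, applies log_softmax itself.
ce = torch.nn.CrossEntropyLoss()(logits, labels)

# MSE variant: compare softmaxed probabilities to the one-hot target.
mse = torch.nn.MSELoss()(torch.softmax(logits, dim=1), onehot)

print(ce.item(), mse.item())   # different scales, so the per-head weights matter when summing
```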
3 votes, 1 answer

pos_weight in binary cross entropy calculation

When we deal with imbalanced training data (more negative samples and fewer positive samples), the pos_weight parameter is usually used. The expectation is that the model will incur a higher loss when a positive sample gets the…
Wu Shiauthie • 69 • 1 • 9
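A minimal sketch of the usual pattern (assuming BCEWithLogitsLoss; the counts here are made up):

```python
import torch

logits  = torch.tensor([ 0.3, -1.2,  0.8, -0.5])   # raw model outputs
targets = torch.tensor([ 1.0,  0.0,  0.0,  0.0])   # 1 positive, 3 negatives

# pos_weight scales only the positive terms of the loss; a common choice
# is (#negative samples / #positive samples), here 3 / 1 = 3.
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([3.0]))
loss = criterion(logits, targets)
print(loss.item())
```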
3 votes, 3 answers

How to calculate the correct cross entropy between 2 tensors in PyTorch when the target is not one-hot?

I am confused about the calculation of cross entropy in PyTorch. If I want to calculate the cross entropy between 2 tensors and the target tensor is not a one-hot label, which loss should I use? It is quite common to calculate the cross entropy…
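One way this is commonly handled (a sketch, not necessarily the accepted answer): compute -sum(q * log_softmax(logits)) by hand, or rely on newer PyTorch versions (>= 1.10), where nn.CrossEntropyLoss also accepts probability targets:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)                        # raw scores for 3 classes
soft_targets = torch.tensor([[0.7, 0.2, 0.1],
                             [0.1, 0.8, 0.1],
                             [0.3, 0.3, 0.4],
                             [0.2, 0.5, 0.3]])    # rows sum to 1, not one-hot

# Manual soft-label cross entropy.
manual = -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# PyTorch >= 1.10 accepts probability targets of the same shape as the input.
builtin = torch.nn.CrossEntropyLoss()(logits, soft_targets)

print(manual.item(), builtin.item())              # same value
```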
3 votes, 1 answer

PyTorch cross-entropy loss with 3D input

I have a network which outputs a 3D tensor of size (batch_size, max_len, num_classes). My ground truth is of shape (batch_size, max_len). If I perform one-hot encoding on the labels, they'll be of shape (batch_size, max_len, num_classes), i.e. the…
Hari Krishnan • 2,049 • 2 • 18 • 29
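A sketch of the two usual fixes (the shapes are the hypothetical ones from the question): nn.CrossEntropyLoss wants the class dimension in position 1, and the integer labels can stay as (batch_size, max_len) without one-hot encoding:

```python
import torch

batch_size, max_len, num_classes = 4, 7, 10
logits = torch.randn(batch_size, max_len, num_classes)         # network output
labels = torch.randint(0, num_classes, (batch_size, max_len))  # ground-truth ids

criterion = torch.nn.CrossEntropyLoss()

# Option 1: move the class dimension to position 1 -> (N, C, max_len).
loss = criterion(logits.permute(0, 2, 1), labels)

# Option 2: fold the sequence dimension into the batch dimension.
loss_flat = criterion(logits.reshape(-1, num_classes), labels.reshape(-1))

print(loss.item(), loss_flat.item())   # identical with the default 'mean' reduction
```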
3 votes, 1 answer

Non-binary ground truth labels in binary crossentropy?

Does it make sense to use non-binary ground truth values for binary crossentropy? Is there any formal proof? It looks like it is used in practice: for example in https://blog.keras.io/building-autoencoders-in-keras.html, i.e. MNIST images are not binary,…
mrgloom • 20,061 • 36 • 171 • 301
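A small numeric check of why soft targets are legitimate here (a sketch; the 0.7 is an arbitrary example value): binary cross entropy with target t is minimized exactly at prediction p = t, so optimizing it drives the output toward the soft label:

```python
import numpy as np

t = 0.7                                # non-binary "ground truth" value
p = np.linspace(0.001, 0.999, 999)     # candidate predictions
bce = -(t * np.log(p) + (1 - t) * np.log(1 - p))

print(p[np.argmin(bce)])               # ≈ 0.7: the minimum sits at p == t
```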
3 votes, 1 answer

Logic behind choosing weight for weighted loss calculation?

What is the general logic behind choosing the weight for calculating a weighted sigmoid cross-entropy loss, or any weighted loss, in the case of an imbalanced dataset? The problem domain is vision/image classification.
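One common heuristic (a sketch, not the only valid choice): weight each class by the inverse of its frequency, e.g. the "balanced" formula n_samples / (n_classes * n_c); for a sigmoid/binary setup the analogue is pos_weight = #negatives / #positives:

```python
import numpy as np
import torch

counts = np.array([900.0, 90.0, 10.0])             # samples per class (made up)
weights = counts.sum() / (len(counts) * counts)     # "balanced" class weights

criterion = torch.nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float32))
```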
3 votes, 2 answers

Derivative in both arguments of torch.nn.BCELoss()

When using torch.nn.BCELoss() on two arguments that are both results of some earlier computation, I get a curious error, which this question is about: RuntimeError: the derivative for 'target' is not implemented. The MCVE is as follows: import…
flawr • 10,814 • 3 • 41 • 71
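A hedged sketch of the usual workaround, assuming no gradient with respect to the target is actually needed: detach the target before passing it to BCELoss so autograd treats it as a constant:

```python
import torch

pred   = torch.sigmoid(torch.randn(5, requires_grad=True))
target = torch.sigmoid(torch.randn(5, requires_grad=True))   # also computed earlier

criterion = torch.nn.BCELoss()

# criterion(pred, target).backward()      # RuntimeError: derivative for 'target' ...
loss = criterion(pred, target.detach())   # detach: no gradient flows into the target
loss.backward()
```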
3 votes, 1 answer

TensorFlow model gets zero loss

import tensorflow as tf
import numpy as np
import os
import re
import PIL

def read_image_label_list(img_directory, folder_name):
    # Input:
    #   - Name of folder (test\\\\train)
    # Output:
    #   - List of names of files in folder
    # …
3 votes, 2 answers

What's the difference between softmax_cross_entropy_with_logits and losses.log_loss?

What's the primary difference between tf.nn.softmax_cross_entropy_with_logits and tf.losses.log_loss? Both methods accept one-hot labels and logits to calculate the cross-entropy loss for classification tasks.
mynameisvinn • 341 • 4 • 10
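A rough NumPy illustration of what the two computations do mathematically (a sketch of the formulas, not the TensorFlow source): softmax_cross_entropy_with_logits applies softmax to raw logits and returns one multi-class loss per example, while log_loss is an element-wise binary cross entropy on probabilities:

```python
import numpy as np

labels = np.array([[0.0, 1.0, 0.0]])
logits = np.array([[1.0, 2.0, 0.5]])

# What tf.nn.softmax_cross_entropy_with_logits computes (per example):
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
softmax_ce = -(labels * log_probs).sum(axis=1)

# What tf.losses.log_loss computes (element-wise binary CE on probabilities, averaged):
probs = np.exp(log_probs)
log_loss = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

print(softmax_ce, log_loss)   # different quantities, even for the same inputs
```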
3 votes, 1 answer

Error when checking target: sparse_categorical_crossentropy output shape

I am attempting to train InceptionV3 on a novel set of images using transfer learning. I am running into this issue, which clearly relates to a mismatch of input and output dimensions (I think), but I can't seem to identify the issue. All relevant…
GhostRider • 2,109 • 7 • 35 • 53
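For reference, a small sketch of the label-shape difference between the two Keras losses (made-up tensors; the question's actual model details are truncated):

```python
import tensorflow as tf

probs = tf.constant([[0.1, 0.8, 0.1],
                     [0.3, 0.2, 0.5]])        # model output, shape (2, 3)

# sparse_categorical_crossentropy: integer class ids, shape (batch,)
sparse_labels = tf.constant([1, 2])
sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()(sparse_labels, probs)

# categorical_crossentropy: one-hot labels, shape (batch, num_classes)
onehot_labels = tf.one_hot(sparse_labels, depth=3)
dense_loss = tf.keras.losses.CategoricalCrossentropy()(onehot_labels, probs)

print(float(sparse_loss), float(dense_loss))  # same value, different label shapes
```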
3 votes, 2 answers

How does Keras deal with log(0) for categorical cross entropy?

I have a neural network, trained on MNIST, with categorical cross entropy as its loss function. For theoretical purposes my output layer is ReLU, therefore a lot of its outputs are 0. Now I stumbled across the following question: why don't I get a…
snoozzz • 135 • 1 • 1 • 7
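A hedged NumPy re-implementation of what tf.keras appears to do with probability inputs (renormalize, then clip with the backend epsilon before taking the log), which is why exact zeros from a ReLU output layer do not produce infinities:

```python
import numpy as np

EPSILON = 1e-7   # tf.keras.backend.epsilon() default

def categorical_crossentropy(target, output):
    output = output / output.sum(axis=-1, keepdims=True)   # renormalize rows
    output = np.clip(output, EPSILON, 1.0 - EPSILON)        # avoid log(0)
    return -np.sum(target * np.log(output), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.0, 0.0, 2.0]])   # ReLU output: zero at the true class
print(categorical_crossentropy(y_true, y_pred))   # ≈ 16.12, large but finite
```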
2 votes, 2 answers

How to calculate cross entropy in keras for target values that aren't 0 or 1

Rather than training a neural network to output 1 or 0 through the output sigmoid layer, LeCun recommends (in the paper "Efficient BackProp", LeCun et al., 1998, section 4.5): Choose target values at the point of the maximum second derivative on…
JMS • 1,039 • 4 • 12 • 20
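Whatever target values one settles on, Keras' binary crossentropy accepts soft labels directly; a tiny sketch with arbitrary values strictly inside (0, 1):

```python
import tensorflow as tf

y_true = tf.constant([[0.2, 0.8]])     # soft targets instead of hard 0/1
y_pred = tf.constant([[0.3, 0.7]])     # sigmoid outputs

loss = tf.keras.losses.binary_crossentropy(y_true, y_pred)
print(float(loss[0]))                  # well defined for non-binary targets
```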
2 votes, 1 answer

Why does the PyTorch CrossEntropyLoss use label encoding instead of one-hot encoding?

I'm learning about the CrossEntropyLoss module in PyTorch. The tutor says you should pass the target value y 'label encoded', not 'one-hot encoded', like this: loss = nn.CrossEntropyLoss(); Y = torch.tensor([0]); Y_pred_good = torch.tensor([[2.0, 1.0,…
yubin • 125 • 8
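A short sketch of why integer labels suffice: nn.CrossEntropyLoss is log_softmax followed by picking out the true-class entry (negative log likelihood), so a one-hot vector would only ever be used to select a single element anyway:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1]])   # raw scores for 3 classes
target = torch.tensor([0])                 # class index ("label encoded")

loss = torch.nn.CrossEntropyLoss()(logits, target)

# Equivalent by hand: log_softmax, then index the entry of the true class.
manual = -F.log_softmax(logits, dim=1)[0, target[0]]
print(loss.item(), manual.item())          # same value
```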
2 votes, 1 answer

Different cross entropy results from NumPy and PyTorch

My prediction is y_hat = [0.57, 0.05, 0.14, 0.10, 0.14] and my target is target = [1, 0, 0, 0, 0]. I need to calculate the cross-entropy loss with NumPy and with the PyTorch loss function. Using NumPy, my formula is -np.sum(target*np.log(y_hat)), and I got…
hello m • 21 • 2
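A guess at the source of the discrepancy (hedged, since the excerpt is truncated): nn.CrossEntropyLoss treats its input as logits and applies log_softmax internally, so feeding already-normalized probabilities gives a different number than -sum(target * log(y_hat)); NLLLoss on log-probabilities reproduces the NumPy value:

```python
import numpy as np
import torch

y_hat  = np.array([0.57, 0.05, 0.14, 0.10, 0.14])    # already probabilities
target = np.array([1, 0, 0, 0, 0])

np_ce = -np.sum(target * np.log(y_hat))               # ≈ 0.562

probs = torch.tensor([y_hat], dtype=torch.float32)
idx = torch.tensor([0])

torch_ce = torch.nn.CrossEntropyLoss()(probs, idx)    # softmaxes the probs again
torch_match = torch.nn.NLLLoss()(torch.log(probs), idx)

print(np_ce, torch_ce.item(), torch_match.item())     # first and last agree; the middle differs
```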
2 votes, 1 answer

CrossEntropyLoss showing poor accuracy on 2d output

I'm trying some experiments on a simple neural network that just tries to learn the squares of some random numbers, represented as arrays of decimal digits; the code is copied below, with changes indicated by comments. The version using nn.Softmax(dim=2)…
rwallace • 31,405 • 40 • 123 • 242
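One frequent pitfall with this combination (possibly, though not certainly, what the truncated excerpt describes): nn.CrossEntropyLoss already applies log_softmax, so placing an nn.Softmax layer in front of it softmaxes twice and flattens the gradients; a minimal 2D sketch:

```python
import torch

logits = torch.randn(8, 10, requires_grad=True)    # raw scores for 10 classes
labels = torch.randint(0, 10, (8,))
criterion = torch.nn.CrossEntropyLoss()

loss_ok = criterion(logits, labels)                            # feed raw logits
loss_double = criterion(torch.softmax(logits, dim=1), labels)  # "double softmax"

# The double-softmax loss can never fall much below -log(e / (e + 9)) ≈ 1.46,
# no matter how confident the model becomes, which stalls training.
print(loss_ok.item(), loss_double.item())
```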