Questions tagged [cross-entropy]

In machine learning and information theory, cross-entropy measures the dissimilarity between two probability distributions over the same underlying set of events (it is not a true distance: it is asymmetric and does not vanish when the two distributions coincide). Cross-entropy is the most common choice of loss function for classification tasks in neural networks.

360 questions
2 votes, 0 answers

How can I avoid log(0) in the implementation of the categorical cross-entropy loss function?

I have implemented the cross-entropy and its gradient in Python, but I'm not sure if it's correct. My implementation is for a neural network. yEst = np.array([1, 6, 3, 5]).T # output of a softmax function in last layer y = np.array([0, 6, 3,…
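A common way to sidestep log(0) is to clip the softmax outputs away from 0 and 1 before taking the logarithm, similar to what Keras does internally with its backend epsilon. A minimal NumPy sketch, with an illustrative epsilon and function name rather than the asker's actual code:

    import numpy as np

    def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
        # y_true: one-hot labels, shape (n_samples, n_classes)
        # y_pred: softmax outputs, shape (n_samples, n_classes)
        # Clip predictions into [eps, 1 - eps] so np.log never sees an exact 0.
        y_pred = np.clip(y_pred, eps, 1.0 - eps)
        return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

    y_true = np.array([[0, 0, 1], [1, 0, 0]])
    y_pred = np.array([[0.0, 0.2, 0.8], [1.0, 0.0, 0.0]])  # contains exact 0s and 1s
    print(categorical_cross_entropy(y_true, y_pred))       # finite, no log(0) warning
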
2 votes, 2 answers

Suppress use of Softmax in CrossEntropyLoss for PyTorch Neural Net

I know there's no need to use an nn.Softmax() function in the output layer for a neural net when using nn.CrossEntropyLoss as a loss function. However, I need to do so; is there a way to suppress the implemented use of softmax in nn.CrossEntropyLoss…
Quastiat • 1,164 • 1 • 18 • 37
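One possible workaround, since nn.CrossEntropyLoss is just LogSoftmax followed by NLLLoss: keep the nn.Softmax() layer in the model and apply nn.NLLLoss to the log of its output instead of using nn.CrossEntropyLoss. A small sketch with made-up shapes and values:

    import torch
    import torch.nn as nn

    logits = torch.randn(4, 3)              # raw outputs of the last linear layer
    targets = torch.tensor([0, 2, 1, 2])

    # nn.CrossEntropyLoss applies log-softmax itself, so it expects raw logits.
    ce = nn.CrossEntropyLoss()(logits, targets)

    # If the network already ends in nn.Softmax(), feed log-probabilities
    # to nn.NLLLoss instead; the two losses agree.
    probs = nn.Softmax(dim=1)(logits)
    nll = nn.NLLLoss()(torch.log(probs), targets)

    print(torch.allclose(ce, nll))          # True
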
2 votes, 0 answers

How does gradient descent work when the one-hot encoded labels are all zeros?

I've got a dataset of images of a vineyard taken from above that I'd like to sub-classify. There are 3 main classes [ground, vegetal, vineyard], and some of them are divided into several subclasses: vegetal: [grass, flower], vineyard: [healthy, disease A,…
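Worth noting for this question: with categorical cross-entropy, -sum(y * log(p)), an all-zero label vector makes every term zero, so such samples contribute zero loss and zero gradient and effectively do not train the network; they typically need to be masked out or given their own class. A tiny NumPy illustration with invented values:

    import numpy as np

    y_pred = np.array([0.7, 0.2, 0.1])   # softmax output for one sample
    y_all_zero = np.zeros(3)             # "none of the subclasses" label

    # Every term of -sum(y * log(p)) is multiplied by 0, so the loss is 0
    # no matter what the network predicts -- no gradient flows from this sample.
    print(-np.sum(y_all_zero * np.log(y_pred)))   # 0.0
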
2 votes, 0 answers

Tensorflow/Keras: Cost function that penalizes specific errors/confusions

I have a classification scenario with more than 10 classes where one class is a dedicated "garbage" class. With a CNN I currently reach accuracies around 96%, which is good enough for me. In this particular application false positives (recognizing…
Johannes • 3,300 • 2 • 20 • 35
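One common pattern, sketched here under the assumption of one-hot labels and a softmax output: build a penalty matrix whose entry (i, j) scales the loss when the true class is i and the predicted class is j, and use it to weight the standard categorical cross-entropy per sample. The class count, matrix values, and function name below are illustrative only:

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras

    NUM_CLASSES = 11
    penalty = np.ones((NUM_CLASSES, NUM_CLASSES), dtype="float32")
    penalty[3, 0] = 5.0     # e.g. confusing class 3 with the garbage class costs 5x
    penalty_t = tf.constant(penalty)

    def weighted_categorical_crossentropy(y_true, y_pred):
        # Penalty row for the true class, averaged over the predicted distribution.
        weights = tf.reduce_sum(tf.matmul(y_true, penalty_t) * y_pred, axis=-1)
        return weights * keras.losses.categorical_crossentropy(y_true, y_pred)

    # model.compile(optimizer="adam", loss=weighted_categorical_crossentropy)
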
2 votes, 1 answer

The output of softmax makes the binary cross-entropy's output NaN, what should I do?

I have implemented a neural network in TensorFlow where the last layer is a convolution layer. I feed the output of this convolution layer into a softmax activation function, and then into a cross-entropy loss function, which is defined as follows…
MRM • 1,099 • 2 • 12 • 29
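The usual culprit: when the softmax underflows to an exact 0, log(0) is -inf, and 0 * -inf gives NaN. Two standard fixes are to clip the probabilities before the log, or (better) to skip the hand-written loss and apply the numerically stable fused op directly to the logits. A TensorFlow 2 sketch with made-up logits chosen to trigger the underflow:

    import tensorflow as tf

    logits = tf.constant([[100.0, -100.0, 3.0]])   # extreme values -> softmax underflow
    labels = tf.constant([[1.0, 0.0, 0.0]])

    probs = tf.nn.softmax(logits)                  # second entry underflows to exactly 0
    naive = -tf.reduce_sum(labels * tf.math.log(probs), axis=-1)           # NaN

    eps = 1e-7
    clipped = -tf.reduce_sum(
        labels * tf.math.log(tf.clip_by_value(probs, eps, 1.0)), axis=-1)  # finite
    fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

    print(naive.numpy(), clipped.numpy(), fused.numpy())
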
2 votes, 2 answers

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed

I get: RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /opt/conda/conda-bld/pytorch_1550796191843/work/aten/src/THNN/generic/ClassNLLCriterion.c:93 When running this code: criterion = nn.CrossEntropyLoss() …
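This assertion almost always means a target label falls outside the range [0, n_classes - 1] that nn.CrossEntropyLoss expects: labels that are 1-indexed, labels containing a sentinel value, or a final layer with too few output units all trigger it. A minimal reproduction and fix, with invented numbers:

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    logits = torch.randn(4, 3)                # 3 output units -> valid targets are 0, 1, 2

    bad_targets = torch.tensor([1, 2, 3, 1])  # the 3 is out of range and raises the assertion
    good_targets = bad_targets - 1            # e.g. shift 1-indexed labels down to 0-indexed

    loss = criterion(logits, good_targets)    # works
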
2 votes, 1 answer

Custom parameters in cross-entropy Keras

I need to build a custom categorical cross-entropy loss function where I compare y_true against Q*y_pred instead of just y_pred, where Q is a matrix. The problem is that the batch size is not necessarily equal to 1, so there is a problem with dimensions. How…
ofi • 23 • 3
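Assuming Q has shape (n_classes, n_classes) and y_pred has shape (batch, n_classes), applying Q to every sample is just the batched product y_pred @ Q^T, which works for any batch size; no loop over the batch is needed. A sketch (the matrix values, renormalization step, and names are assumptions, not from the question):

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras

    NUM_CLASSES = 4
    Q = tf.constant(np.random.rand(NUM_CLASSES, NUM_CLASSES), dtype=tf.float32)

    def transformed_categorical_crossentropy(y_true, y_pred):
        # Batched Q * y_pred: (batch, C) @ (C, C)^T -> (batch, C), any batch size.
        q_pred = tf.matmul(y_pred, Q, transpose_b=True)
        # Renormalize so the transformed scores still sum to 1
        # (assumes Q keeps them non-negative).
        q_pred = q_pred / tf.reduce_sum(q_pred, axis=-1, keepdims=True)
        return keras.losses.categorical_crossentropy(y_true, q_pred)

    # model.compile(optimizer="adam", loss=transformed_categorical_crossentropy)
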
2 votes, 3 answers

How to calculate binary cross-entropy between a predicted and a test set in Python?

I'm using a test list and a prediction list, each containing 4000 elements, like in this example: test_list=[1,0,0,1,0,.....] prediction_list=[1,1,0,1,0......] How can I find the binary cross-entropy between these 2 lists in Python code?…
Alastor • 101 • 1 • 7
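With hard 0/1 predictions the log terms blow up, so the predictions have to be clipped (or kept as probabilities in the first place); scikit-learn's log_loss computes the same quantity with its own internal clipping, whose details vary by version. A short sketch using shortened example lists:

    import numpy as np
    from sklearn.metrics import log_loss

    test_list       = [1, 0, 0, 1, 0]
    prediction_list = [1, 1, 0, 1, 0]

    # Manual binary cross-entropy; clip so log(0) cannot occur.
    eps = 1e-15
    y = np.asarray(test_list, dtype=float)
    p = np.clip(np.asarray(prediction_list, dtype=float), eps, 1 - eps)
    bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    print(bce, log_loss(test_list, prediction_list, labels=[0, 1]))
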
2 votes, 0 answers

Why is binary_crossentropy performing better than categorical_crossentropy for multiclass classification in Keras?

I've seen many similar issues on Stack Overflow, but none of them refers to my case. I have a multiclass classification problem and my labels are mutually exclusive. Training with binary_crossentropy, due to a typo, resulted in lower loss and higher…
Lara Larsen • 365 • 1 • 2 • 16
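A frequently cited explanation: on one-hot labels the two losses are simply different quantities (binary cross-entropy averages an independent yes/no loss over every output), and with loss='binary_crossentropy' Keras has historically reported per-output binary accuracy, which looks inflated on a multiclass problem. A NumPy comparison on one made-up sample:

    import numpy as np

    y_true = np.array([0.0, 0.0, 1.0])   # one-hot label, 3 mutually exclusive classes
    y_pred = np.array([0.1, 0.2, 0.7])   # softmax output

    # Categorical cross-entropy: a single -log term for the true class.
    cce = -np.sum(y_true * np.log(y_pred))                                       # ~0.357

    # Binary cross-entropy: every class treated as its own yes/no problem,
    # averaged over the 3 outputs -- a smaller, non-comparable number.
    bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))  # ~0.228

    print(cce, bce)
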
2 votes, 0 answers

Single sample gradient using tf.gradients just one time (TypeError: Fetch argument None has invalid type)

So I am running this code for taking single-sample gradients in TensorFlow, using tf.gradients() just once. It works totally fine when I just have one set of variables: def f(x, W): return tf.matmul(x, W) graph1 = tf.Graph() with…
ebramos • 103 • 1 • 10
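tf.gradients returns None for any variable the loss does not depend on, and fetching that None in session.run() raises exactly this TypeError. One workaround is to replace the missing gradients with zeros, sketched here in TF1 style with invented shapes:

    import tensorflow.compat.v1 as tf
    tf.disable_eager_execution()

    x = tf.placeholder(tf.float32, [None, 3])
    W1 = tf.Variable(tf.ones([3, 2]))
    W2 = tf.Variable(tf.ones([3, 2]))     # not used in the graph below

    loss = tf.reduce_sum(tf.matmul(x, W1))

    grads = tf.gradients(loss, [W1, W2])  # [<tensor>, None] -- W2 is unconnected
    safe_grads = [g if g is not None else tf.zeros_like(v)
                  for g, v in zip(grads, [W1, W2])]

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(safe_grads, feed_dict={x: [[1.0, 2.0, 3.0]]}))
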
2 votes, 1 answer

Caffe softmax with loss layer for semantic segmentation loss calculation

The Caffe documentation for softmax_loss_layer.hpp seems to be targeted towards classification tasks rather than semantic segmentation. However, I have seen this layer being used for the latter. What would be the dimensions of the input blobs…
simplename • 717 • 7 • 15
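For what it's worth, the commonly described setup for segmentation is a score blob of shape N x C x H x W and a label blob of shape N x 1 x H x W holding one integer class index per pixel; the loss is then an ordinary softmax cross-entropy computed per pixel and averaged. A NumPy sketch of that per-pixel computation (shapes invented, not Caffe code):

    import numpy as np

    N, C, H, W = 2, 5, 4, 4                        # batch, classes, height, width
    scores = np.random.randn(N, C, H, W)           # bottom[0]: per-pixel class scores
    labels = np.random.randint(0, C, (N, H, W))    # bottom[1]: one class index per pixel

    # Softmax over the channel axis, then -log(prob of the true class) per pixel.
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    n, h, w = np.meshgrid(np.arange(N), np.arange(H), np.arange(W), indexing="ij")
    loss = -np.log(probs[n, labels, h, w]).mean()
    print(loss)
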
2 votes, 1 answer

What is the difference between CrossEntropy and NegativeLogLikelihood in MXNet?

I was trying to evaluate my classification models with a log-loss metric using the mxnet.metric module. I came across two classes, CrossEntropy and NegativeLogLikelihood, which have the same definition and very similar implementations. Both have the same…
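For hard class labels both metrics boil down to the same quantity, the mean of -log p(true class), which would explain the identical definitions; any difference lies in how the inputs are interpreted rather than in the formula. The underlying number, illustrated in NumPy with invented values:

    import numpy as np

    labels = np.array([2, 0])                 # true class indices
    probs = np.array([[0.1, 0.2, 0.7],        # predicted distributions
                      [0.6, 0.3, 0.1]])

    # Cross-entropy against a hard label == negative log-likelihood of that label.
    nll = -np.log(probs[np.arange(len(labels)), labels]).mean()
    print(nll)
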
2 votes, 2 answers

Why is the reduce_mean applied to the output of sparse_softmax_cross_entropy_with_logits?

There are several tutorials that apply reduce_mean to the output of sparse_softmax_cross_entropy_with_logits. For example cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv)) or cross_entropy =…
Hong Cheng • 318 • 1 • 4 • 19
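Background that may help here: sparse_softmax_cross_entropy_with_logits returns one loss value per example (a vector of shape [batch_size]), not a scalar, so reduce_mean collapses it into a single number the optimizer can minimize; averaging rather than summing also keeps the loss scale independent of the batch size. A TF2 sketch with made-up values:

    import tensorflow as tf

    logits = tf.constant([[2.0, 0.5, -1.0],
                          [0.1, 0.2,  3.0]])
    labels = tf.constant([0, 2])

    per_example = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)          # shape [2]: one loss per example
    batch_loss = tf.reduce_mean(per_example)   # scalar fed to the optimizer

    print(per_example.numpy(), batch_loss.numpy())
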
2 votes, 1 answer

Why is the code for a neural network with a sigmoid so different than the code with softmax_cross_entropy_with_logits?

When using neural networks for classification, it is said that: You generally want to use softmax cross-entropy output, as this gives you the probability of each of the possible options. In the common case where there are only two options, you want…
rwallace • 31,405 • 40 • 123 • 242
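Part of the answer is that the two formulations coincide for two classes: a single sigmoid output sigma(z) equals the second component of a two-way softmax over the logits [0, z], so binary networks can use one sigmoid unit where a multiclass network would use a softmax. A quick NumPy check:

    import numpy as np

    z = 1.3                                    # single logit for the positive class
    sigmoid = 1.0 / (1.0 + np.exp(-z))

    two_logits = np.array([0.0, z])            # equivalent two-class softmax
    softmax = np.exp(two_logits) / np.exp(two_logits).sum()

    print(sigmoid, softmax[1])                 # identical up to rounding
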
2 votes, 1 answer

How to batch compute cross entropy for pointer networks?

In pointer networks the output logits are over the length of the inputs. Working with such batches means padding the inputs to the maximum length of the batch inputs. Now, this is all fine until we have to compute the loss. Currently what I am doing is…
figs_and_nuts • 4,870 • 2 • 31 • 56
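One way to handle the padding, sketched in PyTorch (the question does not name a framework; the shapes, PAD value, and mask below are invented): push the logits of padded input positions to -inf before the softmax, and let ignore_index drop the padded target steps from the average.

    import torch
    import torch.nn as nn

    PAD = -100                                  # ignore_index for padded target steps

    logits = torch.randn(2, 4, 5)               # (batch, target steps, input positions)
    targets = torch.tensor([[0, 3, 2, PAD],     # shorter target sequence, last step padded
                            [4, 1, 0, 2]])

    # Sequence 0 only has 4 real input positions: give the padded position
    # -inf logits so the softmax assigns it zero probability.
    input_mask = torch.tensor([[1, 1, 1, 1, 0],
                               [1, 1, 1, 1, 1]], dtype=torch.bool)
    logits = logits.masked_fill(~input_mask.unsqueeze(1), float("-inf"))

    # CrossEntropyLoss expects (N, C): flatten batch and time, ignore padded steps.
    loss = nn.CrossEntropyLoss(ignore_index=PAD)(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    print(loss)
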