Use this tag for programming-related questions about the softmax function, also known as the normalized exponential function. Questions specific to a certain programming language should also be tagged with that language.
Questions tagged [softmax]
534 questions
5
votes
1 answer
Is Cross Entropy With Softmax proper for Multi-label Classification?
As mentioned here, cross entropy is not a proper loss function for multi-label classification. My question is: does this also hold for cross entropy with softmax? If it does, how can it be reconciled with this part of the document?
I should mention…

OmG
- 18,337
- 10
- 57
- 90
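
A minimal sketch of one reason softmax outputs are awkward for multi-label targets: softmax probabilities compete and always sum to 1, while per-label sigmoids are scored independently. The logits below are made up for illustration, and the helpers are plain Python rather than any framework's API.

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits where two labels are both strongly indicated.
logits = [4.0, 4.0, -3.0]

softmax_probs = softmax(logits)               # forced to sum to 1
sigmoid_probs = [sigmoid(z) for z in logits]  # each label scored on its own

print(softmax_probs)  # the two strong labels split the mass, each below 0.5
print(sigmoid_probs)  # both strong labels can be close to 1 at once
```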
5
votes
2 answers
tensorflow log_softmax tf.nn.log(tf.nn.softmax(predict)) tf.nn.softmax_cross_entropy_with_logits
I am trying to implement an MNIST CNN by following the TensorFlow tutorial, and I find that these ways of implementing softmax cross entropy give different results:
(1) bad result
softmax = tf.nn.softmax(pred)
cross_entropy_cnn = - y * tf.log(softmax +…

Alex Wang
- 59
- 1
- 5
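
One common reason a two-step log(softmax(x)) and a fused log-softmax can disagree is floating-point underflow. The sketch below uses plain Python, not TensorFlow, with deliberately extreme made-up logits; it is not necessarily the cause of the discrepancy in the question above.

```python
import math

def naive_log_softmax(logits):
    """log(softmax(x)) computed in two steps; a probability can underflow to 0."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    out = []
    for e in exps:
        p = e / total
        out.append(math.log(p) if p > 0.0 else float("-inf"))
    return out

def stable_log_softmax(logits):
    """Log-softmax via the log-sum-exp trick: x - (max + log(sum(exp(x - max))))."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

logits = [0.0, -800.0]  # hypothetical, deliberately extreme
print(naive_log_softmax(logits))   # second entry becomes -inf
print(stable_log_softmax(logits))  # second entry stays finite
```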
5
votes
1 answer
Does the Inception Model have two softmax outputs?
The Inception v3 model is shown in this image:
The image is from this blog-post:
https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html
It seems that there are two Softmax classification outputs. Why is that?
Which one is…

questiondude
- 772
- 7
- 15
5
votes
1 answer
The output of a softmax isn't supposed to have zeros, right?
I am working on a net in TensorFlow which produces a vector that is then passed through a softmax to give my output.
Now I have been testing this and, weirdly enough, the vector (the one that passed through softmax) has zeros in all coordinates but…

Alperen AYDIN
- 547
- 1
- 6
- 17
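
Mathematically a softmax output is never exactly zero, but in floating point it can be. A small sketch in plain Python (the logits are hypothetical) showing entries underflowing to 0.0 when one logit dominates:

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits with one very large entry.
probs = softmax([1000.0, 0.0, -5.0])
print(probs)  # the smaller entries underflow to exactly 0.0
```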
5
votes
1 answer
Train TensorFlow language model with NCE or sampled softmax
I'm adapting the TensorFlow RNN tutorial to train a language model with an NCE loss or sampled softmax, but I still want to report perplexities. However, the perplexities I get are very weird: for NCE I get several millions (terrible!) whereas for…

niefpaarschoenen
- 560
- 1
- 8
- 19
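
For reference, perplexity is usually defined as the exponential of the mean negative log-probability the model assigns to the true tokens. The sketch below assumes those probabilities come from a full, normalized softmax; an NCE or sampled-softmax training loss is not a normalized log-likelihood, so exponentiating it directly may not give a meaningful perplexity. The probabilities are made up.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to the true tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Hypothetical full-softmax probabilities of the correct next token.
probs = [0.25, 0.25, 0.25, 0.25]
print(perplexity(probs))  # uniform over 4 choices gives a perplexity of about 4
```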
5
votes
2 answers
Understanding softmax classifier
I am trying to understand a simple implementation of Softmax classifier from this link - CS231n - Convolutional Neural Networks for Visual Recognition. Here they implemented a simple softmax classifier. In the example of Softmax Classifier on the…

Shubhashis
- 10,411
- 11
- 33
- 48
4
votes
2 answers
What is the difference between torch.nn.Softmax, torch.nn.functional.softmax, torch.softmax and torch.nn.functional.log_softmax?
I looked for documentation but could not find anything about torch.softmax.
What is the difference between torch.nn.Softmax, torch.nn.functional.softmax, torch.softmax and torch.nn.functional.log_softmax?
Examples are appreciated.

olo
- 133
- 1
- 7
4
votes
1 answer
How to correctly use Cross Entropy Loss vs Softmax for classification?
I want to train a multi-class classifier using PyTorch.
The official PyTorch docs show how to use nn.CrossEntropyLoss() after a last layer of type nn.Linear(84, 10).
However, I remember this is what Softmax does.
This leaves me…

Gulzar
- 23,452
- 27
- 113
- 201
4
votes
2 answers
How to plot a ROC curve from a softmax binary classifier with 2 output neurons?
How do I plot the ROC curve when the discrete output labels come as 2 columns?
Using roc_curve() gives me an error:
ValueError: multilabel-indicator format is not supported
y_prediction = model.predict(test_X)
y_prediction[1]
Out[27]:…

Boels Maxence
- 349
- 3
- 16
4
votes
2 answers
torch.softmax and torch.sigmoid are not equivalent in the binary case
Given:
x_batch = torch.tensor([[-0.3, -0.7], [0.3, 0.7], [1.1, -0.7], [-1.1, 0.7]])
and then applying torch.sigmoid(x_batch):
tensor([[0.4256, 0.3318],
[0.5744, 0.6682],
[0.7503, 0.3318],
[0.2497, 0.6682]])
gives a…

CutePoison
- 4,679
- 5
- 28
- 63
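
A short sketch of the relationship in the binary case, using the first row of x_batch from the question: a two-way softmax equals the sigmoid of the difference of the two logits, which is not the same as applying sigmoid to each logit separately. The helper names are illustrative, in plain Python rather than torch.

```python
import math

def softmax2(a, b):
    """Two-way softmax, shifted by the max for stability."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

a, b = -0.3, -0.7  # first row of x_batch from the question
p_a, p_b = softmax2(a, b)

print(p_a, sigmoid(a - b))  # these agree: softmax depends on the logit difference
print(sigmoid(a))           # element-wise sigmoid of one logit is a different quantity
```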
4
votes
1 answer
Why does softmax get a small gradient when the value is large in the paper 'Attention is all you need'?
Here is a screenshot from the original paper. As I understand it, the paper says that when the value of the dot product is large, the gradient of the softmax becomes very small.
However, I tried to calculate the gradient of…

Richard. Zhu
- 63
- 8
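
A rough numerical illustration of the effect, assuming the standard softmax Jacobian dp_i/dz_j = p_i (δ_ij − p_j): as the logits are scaled up, the output saturates toward one-hot and every Jacobian entry shrinks. The base scores and scale factors are made up.

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_abs_jacobian(logits):
    """Largest |dp_i/dz_j|, using dp_i/dz_j = p_i * (delta_ij - p_j)."""
    p = softmax(logits)
    n = len(p)
    return max(
        abs(p[i] * ((1.0 if i == j else 0.0) - p[j]))
        for i in range(n)
        for j in range(n)
    )

base = [1.0, 2.0, 3.0]  # hypothetical attention scores
for scale in (1.0, 10.0, 100.0):
    scaled = [scale * z for z in base]
    print(scale, max_abs_jacobian(scaled))  # shrinks as the scale grows
```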
4
votes
2 answers
Why does softmax_cross_entropy_with_logits_v2 return a cost even when the values are the same?
I have tested "softmax_cross_entropy_with_logits_v2"
with random numbers:
import tensorflow as tf
x = tf.placeholder(tf.float32,shape=[None,5])
y = tf.placeholder(tf.float32,shape=[None,5])
softmax =…

UfXpri
- 53
- 1
- 4
4
votes
1 answer
Why not use the max value of the output tensor instead of the Softmax function?
I built a CNN model for one-class image classification.
The output tensor is a list of 65 elements. I pass this tensor to the Softmax function and get the classification result.
I think the max value in this output tensor is the classified…

Li Shihao
- 45
- 5
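
A small sketch of the point behind this question: softmax is monotonic, so the index of the largest logit is also the index of the largest softmax probability. Softmax matters when probabilities are needed (for example, for a cross-entropy loss), not for picking the predicted class. The logits are hypothetical.

```python
import math

def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, -1.0, 0.5, 3.2]  # hypothetical network outputs
probs = softmax(logits)

# Argmax taken directly on the logits and on the softmax output.
pred_from_logits = max(range(len(logits)), key=lambda i: logits[i])
pred_from_probs = max(range(len(probs)), key=lambda i: probs[i])

print(pred_from_logits, pred_from_probs)  # same index either way
```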
4
votes
1 answer
How to calibrate the thresholds of neural network output layer in multiclass classification task?
Assume we have a multi-class classification task with 3 classes:
{Cheesecake, Ice Cream, Apple Pie}
Given that we have a trained neural network that can classify which of the three desserts a random chef would prefer. Also, assume that the output…

Mockingbird
- 1,023
- 9
- 17
4
votes
1 answer
What does "2-way" softmax mean?
I couldn't find a clear consensus on this.
What does 2-way softmax mean, and how is it different from n-way softmax?
The definition is given by Geoffrey Hinton in his Coursera course Neural Networks for Machine Learning in Quiz 4 to be:
a softmax…

Andriy Drozdyuk
- 58,435
- 50
- 171
- 272