Questions tagged [softmax]
534 questions

Use this tag for programming-related questions about the softmax function, also known as the normalized exponential function. Questions specific to a certain programming language should also be tagged with that language.
16 votes · 3 answers
How to change the temperature of a softmax output in Keras
I am currently trying to reproduce the results of the following article.
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
I am using Keras with the Theano backend. In the article he talks about controlling the temperature of the final…

chasep255 · 11,745
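
A minimal sketch of one common way to do this in Keras (not necessarily the article's exact code; the layer sizes below are hypothetical): divide the logits by a temperature with a Lambda layer placed before the softmax activation.

from keras.models import Sequential
from keras.layers import Dense, Lambda, Activation

temperature = 0.5  # assumed value; T < 1 sharpens the distribution, T > 1 flattens it

model = Sequential()
model.add(Dense(128, input_dim=64))           # hypothetical layer sizes
model.add(Lambda(lambda x: x / temperature))  # scale the logits by 1/T
model.add(Activation('softmax'))              # softmax over the scaled logits

At sampling time the temperature can equally be applied outside the model, by rescaling stored logits and re-normalizing in NumPy.
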
15 votes · 1 answer
About tf.nn.softmax_cross_entropy_with_logits_v2
I have noticed that tf.nn.softmax_cross_entropy_with_logits_v2(labels, logits) mainly performs 3 operations:
Apply softmax to the logits (y_hat) in order to normalize them: y_hat_softmax = softmax(y_hat).
Compute the cross-entropy loss: y_cross =…

lifang · 1,485
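
A sketch of the first two operations named in the excerpt, written against TensorFlow 1.x (where this op lives), checking the fused op against a manual computation:

import tensorflow as tf

labels = tf.constant([[0.0, 1.0, 0.0]])   # one-hot (or soft) targets
logits = tf.constant([[2.0, 1.0, 0.1]])   # unnormalized scores (y_hat)

y_hat_softmax = tf.nn.softmax(logits)                                 # step 1: normalize
manual_loss = -tf.reduce_sum(labels * tf.log(y_hat_softmax), axis=1)  # step 2: cross-entropy
fused_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits)
# manual_loss and fused_loss agree up to numerical precision.
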
15 votes · 5 answers
How to implement the Softmax derivative independently from any loss function?
For a neural networks library I implemented some activation functions and loss functions and their derivatives. They can be combined arbitrarily and the derivative at the output layers just becomes the product of the loss derivative and the…

danijar · 32,406
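
One way to keep the softmax derivative independent of the loss is to backpropagate an arbitrary upstream gradient dL/ds through it. A minimal NumPy sketch (the function names are mine), using the identity dL/dx = s * (dL/ds - dot(dL/ds, s)):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())   # shift for numerical stability
    return e / e.sum()

def softmax_backward(s, grad_output):
    # Multiply the upstream gradient by the softmax Jacobian
    # J = diag(s) - outer(s, s) without materializing J.
    return s * (grad_output - np.dot(grad_output, s))

s = softmax(np.array([1.0, 2.0, 3.0]))
dx = softmax_backward(s, grad_output=np.array([0.1, -0.2, 0.3]))
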
13 votes · 2 answers
How best to deal with "None of the above" in Image Classification?
This seems to be a fundamental question which some of you out there must have an opinion on. I have an image classifier implemented in CNTK with 48 classes. If the image does not match any of the 48 classes very well, then I'd like to be able to…

Tullhead · 565
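
One common heuristic (certainly not the only answer to this question) is to reject a prediction whose top softmax probability falls below a threshold; a sketch with a made-up threshold:

import numpy as np

def predict_with_reject(probs, threshold=0.5):
    # Return the argmax class, or -1 for "none of the above" when the
    # model is not confident enough in any of the known classes.
    top = int(np.argmax(probs))
    return top if probs[top] >= threshold else -1

predict_with_reject(np.array([0.30, 0.25, 0.45]))  # -> -1

Softmax scores are not calibrated probabilities, so the threshold usually has to be tuned on held-out data.
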
12 votes · 6 answers
TypeError: softmax() got an unexpected keyword argument 'axis'
When I use this, it does not give any error:
out_layer = tf.add(tf.matmul(layer_4, weights['out']), biases['out'])
out_layer = tf.nn.softmax(out_layer)
But when I use this:
model = Sequential()
model.add(Dense(100, input_dim=n_dim,…

Aakash aggarwal · 443
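
The usual cause is a version mismatch: the axis keyword was added to tf.nn.softmax in TensorFlow 1.5, so a newer Keras calling into an older TensorFlow raises this TypeError. A quick check (the fix, assuming this diagnosis, is to align the two versions):

import tensorflow as tf
import keras

print(tf.__version__, keras.__version__)
# If TensorFlow is older than 1.5 while Keras is 2.1.3 or newer,
# upgrading TensorFlow (or pinning Keras to a matching release)
# should resolve the error.
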
12 votes · 1 answer
Why does TensorFlow's documentation call a softmax's input "logits"?
TensorFlow calls each of the inputs to a softmax a logit. They go on to define the softmax's inputs/logits as: "Unscaled log probabilities."
Wikipedia and other sources say that a logit is the log of the odds, and the inverse of the sigmoid/logistic…

Brian Bartoldson · 884
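
Whatever the terminology, the "unscaled" part is easy to demonstrate: softmax only depends on differences between its inputs, so the inputs behave like log-probabilities defined only up to an additive constant (the log of the normalizer). A small NumPy check:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([2.0, 1.0, 0.1])
np.allclose(softmax(x), softmax(x + 5.0))  # True: a constant shift cancels out
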
12 votes · 2 answers
Softmax matrix to 0/1 (OneHot) encoded matrix?
Suppose I have the following tensor t as the output of a softmax function:
t = tf.constant(value=[[0.2, 0.8], [0.6, 0.4]])
>> [[0.2, 0.8],
    [0.6, 0.4]]
Now I would like to convert this matrix t into a matrix that resembles the OneHot encoded…

Davor Josipovic · 5,296
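
A sketch of one way to do this in TensorFlow 1.x (matching the question's API): take the argmax along each row, then re-expand it with tf.one_hot:

import tensorflow as tf

t = tf.constant(value=[[0.2, 0.8], [0.6, 0.4]])
one_hot = tf.one_hot(tf.argmax(t, axis=1), depth=2)  # depth = number of classes
# -> [[0., 1.],
#     [1., 0.]]

Note that argmax is not differentiable, so this belongs after training, not inside a loss.
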
11 votes · 2 answers
How is the categorical_crossentropy implemented in keras?
I'm trying to apply the concept of distillation, basically to train a new smaller network to do the same as the original one but with less computation.
I have the softmax outputs for every sample instead of the logits.
My question is, how is the…

Eric · 1,108
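
In essence, when given probabilities rather than logits, Keras clips the predictions away from 0 and 1 and computes the usual cross-entropy sum. A NumPy sketch of that core computation (not the exact backend source):

import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1.0 - eps)         # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1]])                 # softmax outputs, not logits
categorical_crossentropy(y_true, y_pred)             # -> array([0.22314355])
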
11 votes · 2 answers
Derivative of a softmax function explanation
I am trying to compute the derivative of the activation function for softmax. I found this: https://math.stackexchange.com/questions/945871/derivative-of-softmax-loss-function, but nobody seems to give the proper derivation for how we would get the…

Roshini · 703
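
The closed form is ds_i/dx_j = s_i * (delta_ij - s_j), i.e. the Jacobian diag(s) - outer(s, s); a finite-difference check in NumPy makes the derivation easy to verify:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([1.0, 2.0, 3.0])
s = softmax(x)
analytic = np.diag(s) - np.outer(s, s)   # J[i, j] = s_i * (delta_ij - s_j)

eps = 1e-6
numeric = np.stack(
    [(softmax(x + eps * e_j) - softmax(x - eps * e_j)) / (2 * eps)
     for e_j in np.eye(3)], axis=1)      # column j: perturb x_j
np.allclose(analytic, numeric)           # True
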
9 votes · 2 answers
Per pixel softmax for fully convolutional network
I'm trying to implement something like a fully convolutional network, where the last convolution layer uses filter size 1x1 and outputs a 'score' tensor. The score tensor has shape [Batch, height, width, num_classes].
My question is, what function…

Wei Liu · 1,004
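
tf.nn.softmax already normalizes over the last axis by default, so applying it to the score tensor gives an independent distribution over classes at every pixel. A TF 1.x sketch with made-up shapes:

import tensorflow as tf

score = tf.random_normal([4, 8, 8, 21])  # hypothetical [batch, height, width, num_classes]
per_pixel_probs = tf.nn.softmax(score)   # same shape; sums to 1 over the class axis
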
8 votes · 3 answers
Implementation of softmax function returns nan for high inputs
I am trying to implement softmax at the end of a CNN. The output I get is NaN and zeros. I am giving high input values to the softmax, around 10-20k; for example, the array X = [2345, 3456, 6543, -6789, -9234].
My function is
def softmax(X):
    B = np.exp(X)
    …

Alok Ranjan Swain · 109
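
Inputs that large overflow np.exp (exp(2345) is inf in float64), which is where the NaNs and zeros come from. The standard fix, sketched below as a stabilized version of the question's function, is to subtract the maximum first; the mathematical result is unchanged:

import numpy as np

def softmax(X):
    Z = X - np.max(X)   # largest exponent becomes 0, so exp cannot overflow
    B = np.exp(Z)
    return B / B.sum()

softmax(np.array([2345, 3456, 6543, -6789, -9234], dtype=np.float64))
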
8 votes · 2 answers
Softmax derivative in NumPy approaches 0 (implementation)
I'm trying to implement the softmax function for a neural network written in NumPy. Let h be the softmax value of a given signal i.
I've struggled to implement the softmax activation function's partial derivative.
I'm currently stuck at issue…

jorgenkg · 4,140
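
Part of the answer may simply be that the derivative should be small here: the elementwise term is h_i * (1 - h_i), which vanishes as the softmax saturates. A quick NumPy illustration:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

h = softmax(np.array([10.0, 0.0, 0.0]))  # nearly one-hot output
d = h * (1 - h)                          # diagonal of the softmax Jacobian
# All entries are tiny: when one h_i is near 1 and the rest near 0,
# every h_i * (1 - h_i) approaches 0, so a near-zero derivative is
# expected behaviour, not necessarily an implementation bug.
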
7 votes · 1 answer
What is a dimensional range of [-1,0] in Pytorch?
So I'm struggling to understand some terminology about collections in PyTorch. I keep running into the same kinds of errors about the range of my tensors being incorrect, and when I try to Google for a solution, often the explanations are further…

Reactgular · 52,335
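
The range in that error message is just Python-style negative indexing over tensor axes: a 1-D tensor has a single axis that can be addressed as 0 or as -1, hence "expected to be in range of [-1, 0]". A small sketch:

import torch

v = torch.randn(5)            # 1-D tensor: its only axis is 0, a.k.a. -1
a = torch.softmax(v, dim=0)
b = torch.softmax(v, dim=-1)  # same axis, counted from the end
torch.equal(a, b)             # True
# torch.softmax(v, dim=1)     # IndexError: dim out of range [-1, 0]
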
7 votes · 4 answers
Logsoftmax stability
I know how to make softmax stable by adding -max_i x_i to each element. This avoids overflow and underflow.
Now, taking the log of this can still underflow: softmax(x) can evaluate to zero, and its log is then -infinity.
I am not sure how to fix it. I know this…

Abhishek Bhatia · 9,404
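
The usual fix is to compute log-softmax directly with the log-sum-exp trick, instead of taking the log of an already-underflowed softmax. A NumPy sketch:

import numpy as np

def log_softmax(x):
    # log softmax(x) = (x - m) - log(sum(exp(x - m))), with m = max(x).
    # Nothing here exponentiates back, so a probability that would
    # underflow to 0 just becomes a large negative log value.
    shifted = x - np.max(x)
    return shifted - np.log(np.sum(np.exp(shifted)))

log_softmax(np.array([1000.0, 0.0, -1000.0]))  # finite everywhere
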
7 votes · 1 answer
semantic segmentation with tensorflow - ValueError in loss function (sparse-softmax)
So, I'm working on building a fully convolutional network (FCN), based on Marvin Teichmann's tensorflow-fcn.
My input image data, for the time being, is a 750x750x3 RGB image.
After running through the network, I use logits of shape…

Shiva · 473
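
A common cause of this ValueError is a shape mismatch: tf.nn.sparse_softmax_cross_entropy_with_logits expects the labels tensor to have exactly one dimension fewer than the logits (no trailing class or channel axis). A TF 1.x sketch with assumed shapes:

import tensorflow as tf

logits = tf.random_normal([1, 750, 750, 20])      # [batch, H, W, num_classes]
labels = tf.zeros([1, 750, 750], dtype=tf.int32)  # [batch, H, W] integer class ids

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
# If the labels arrive as [batch, H, W, 1], squeeze the last axis first.
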