Questions tagged [softmax]

Use this tag for programming-related questions about the softmax function, also known as the normalized exponential function. Questions specific to a certain programming language should also be tagged with that language.

534 questions
16
votes
3 answers

How to change the temperature of a softmax output in Keras

I am currently trying to reproduce the results of the following article: http://karpathy.github.io/2015/05/21/rnn-effectiveness/ I am using Keras with the Theano backend. In the article he talks about controlling the temperature of the final…
chasep255
  • 11,745
  • 8
  • 58
  • 115
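
A minimal sketch of the temperature trick that question is about: divide the logits by a temperature T before the softmax, so T > 1 flattens the distribution and T < 1 sharpens it (in Keras this can be done with a Lambda layer dividing by T just before the softmax Activation). Plain NumPy, names hypothetical:

    import numpy as np

    def softmax_with_temperature(logits, temperature=1.0):
        # T > 1 flattens the distribution, T < 1 sharpens it, T -> 0 approaches argmax
        z = np.asarray(logits, dtype=np.float64) / temperature
        z = z - z.max()              # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.1])
    print(softmax_with_temperature(logits, 0.5))   # sharper
    print(softmax_with_temperature(logits, 2.0))   # flatter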
15
votes
1 answer

About tf.nn.softmax_cross_entropy_with_logits_v2

I have noticed that tf.nn.softmax_cross_entropy_with_logits_v2(labels, logits) mainly performs three operations: apply softmax to the logits (y_hat) in order to normalize them, y_hat_softmax = softmax(y_hat); compute the cross-entropy loss, y_cross =…
lifang
  • 1,485
  • 3
  • 16
  • 23
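
The three operations the asker lists can be reproduced in a few lines of NumPy; a sketch, keeping the question's y_hat naming:

    import numpy as np

    y_hat  = np.array([[2.0, 1.0, 0.1]])    # logits
    labels = np.array([[1.0, 0.0, 0.0]])    # one-hot (or soft) targets

    # 1. softmax normalizes the logits
    e = np.exp(y_hat - y_hat.max(axis=1, keepdims=True))
    y_hat_softmax = e / e.sum(axis=1, keepdims=True)

    # 2. per-class cross-entropy: -labels * log(softmax)
    y_cross = -labels * np.log(y_hat_softmax)

    # 3. sum over the class axis for one loss value per example
    loss = y_cross.sum(axis=1)
    print(loss)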
15
votes
5 answers

How to implement the Softmax derivative independently from any loss function?

For a neural network library I implemented some activation functions and loss functions, together with their derivatives. They can be combined arbitrarily, and the derivative at the output layer just becomes the product of the loss derivative and the…
danijar
  • 32,406
  • 45
  • 166
  • 297
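
The piece that makes softmax awkward here is that its derivative is a full Jacobian matrix, ds_i/dx_j = s_i * (delta_ij - s_j), not an elementwise vector like most activations; a NumPy sketch:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def softmax_jacobian(x):
        # J[i, j] = s_i * (delta_ij - s_j) = diag(s) - outer(s, s)
        s = softmax(x)
        return np.diag(s) - np.outer(s, s)

    # Backprop multiplies the upstream loss gradient by this Jacobian
    # (which is symmetric, so no transpose is needed here):
    x = np.array([1.0, 2.0, 3.0])
    upstream = np.array([0.1, -0.2, 0.3])
    print(softmax_jacobian(x) @ upstream)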
13
votes
2 answers

How best to deal with "None of the above" in Image Classification?

This seems to be a fundamental question on which some of you out there must have an opinion. I have an image classifier implemented in CNTK with 48 classes. If the image does not match any of the 48 classes very well, then I'd like to be able to…
Tullhead
  • 565
  • 2
  • 7
  • 17
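
One common, if imperfect, approach is to threshold the winning softmax probability and answer "none of the above" when the model is not confident; a sketch, where the threshold is an assumption to be tuned on held-out data:

    import numpy as np

    def classify_with_reject(probs, threshold=0.5):
        # probs: softmax output over the known classes
        best = int(np.argmax(probs))
        if probs[best] < threshold:
            return None              # "none of the above"
        return best

    print(classify_with_reject(np.array([0.90, 0.05, 0.05])))   # 0
    print(classify_with_reject(np.array([0.40, 0.35, 0.25])))   # None

Softmax can still be overconfident on inputs far from the training data, so an explicit 49th "background" class trained on negative examples is a frequently suggested alternative.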
12
votes
6 answers

TypeError: softmax() got an unexpected keyword argument 'axis'

When I use this, it does not give any error: out_layer = tf.add(tf.matmul(layer_4, weights['out']), biases['out']) out_layer = tf.nn.softmax(out_layer) But when I use this: model = Sequential() model.add(Dense(100, input_dim=n_dim,…
Aakash aggarwal
  • 443
  • 2
  • 6
  • 21
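
That error usually means the installed Keras passes axis= to a tf.nn.softmax that is too old to accept it; TensorFlow releases before 1.5 only understood dim=. Upgrading TensorFlow (or matching the Keras version to it) is the clean fix; a hedged compatibility shim if upgrading is not an option:

    import tensorflow as tf

    def softmax_compat(x, axis=-1):
        # TF >= 1.5 accepts `axis`; older releases only understood `dim`
        try:
            return tf.nn.softmax(x, axis=axis)
        except TypeError:
            return tf.nn.softmax(x, dim=axis)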
12
votes
1 answer

Why does TensorFlow's documentation call a softmax's input "logits"?

TensorFlow calls each of the inputs to a softmax a logit, and its documentation goes on to define the softmax's inputs/logits as "unscaled log probabilities." Wikipedia and other sources say that a logit is the log of the odds, and the inverse of the sigmoid/logistic…
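
The two definitions coincide in the binary case, which a short check makes concrete: softmax over the pair [z, 0] is exactly the logistic sigmoid of z, so a softmax input behaves like a log-odds up to an additive constant per example, hence "unscaled":

    import numpy as np

    z = 1.5
    sigmoid = 1.0 / (1.0 + np.exp(-z))
    pair = np.exp([z, 0.0]) / np.sum(np.exp([z, 0.0]))
    print(sigmoid, pair[0])                    # same value
    print(np.log(sigmoid / (1.0 - sigmoid)))   # recovers z: the log-odds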
12
votes
2 answers

Softmax matrix to 0/1 (OneHot) encoded matrix?

Suppose I have the following tensor t as the output of a softmax function: t = tf.constant(value=[[0.2, 0.8], [0.6, 0.4]]) >> [[0.2, 0.8], [0.6, 0.4]] Now I would like to convert this matrix t into a matrix that resembles the OneHot encoded…
Davor Josipovic
  • 5,296
  • 1
  • 39
  • 57
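
The usual recipe is argmax along the class axis followed by one-hot re-encoding; a sketch in the question's TensorFlow terms (assuming eager execution; under TF1 this would need a Session):

    import tensorflow as tf

    t = tf.constant([[0.2, 0.8], [0.6, 0.4]])
    # winning class per row, re-encoded as a one-hot row
    one_hot = tf.one_hot(tf.argmax(t, axis=1), depth=2)
    print(one_hot)   # [[0., 1.], [1., 0.]]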
11
votes
2 answers

How is the categorical_crossentropy implemented in keras?

I'm trying to apply the concept of distillation, basically to train a new smaller network to do the same as the original one but with less computation. I have the softmax outputs for every sample instead of the logits. My question is, how is the…
Eric
  • 1,108
  • 3
  • 11
  • 25
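
When given probabilities rather than logits, Keras's categorical_crossentropy essentially clips the predictions away from 0 and 1 and computes -sum(y_true * log(y_pred)) over the class axis; a NumPy sketch of that behavior (the epsilon here is an assumption standing in for the backend's fuzz factor):

    import numpy as np

    def categorical_crossentropy(y_true, y_pred, eps=1e-7):
        # y_pred are probabilities (e.g. softmax outputs), not logits
        y_pred = np.clip(y_pred, eps, 1.0 - eps)
        return -(y_true * np.log(y_pred)).sum(axis=-1)

    y_true = np.array([[0.0, 1.0, 0.0]])
    y_pred = np.array([[0.1, 0.8, 0.1]])
    print(categorical_crossentropy(y_true, y_pred))   # ~0.223 = -log(0.8)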
11
votes
2 answers

Derivative of a softmax function explanation

I am trying to compute the derivative of the softmax activation function. I found this: https://math.stackexchange.com/questions/945871/derivative-of-softmax-loss-function but nobody seems to give the proper derivation of how we would get the…
Roshini
  • 703
  • 2
  • 8
  • 21
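
The derivation being asked about ends at ds_i/dx_j = s_i * (delta_ij - s_j); a quick finite-difference check of that formula:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    x = np.array([0.5, 1.0, -0.5])
    s = softmax(x)
    analytic = np.diag(s) - np.outer(s, s)    # s_i * (delta_ij - s_j)

    h = 1e-6
    numeric = np.zeros((3, 3))
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = h
        numeric[:, j] = (softmax(x + dx) - softmax(x - dx)) / (2 * h)

    print(np.abs(analytic - numeric).max())   # ~1e-11: the formula checks out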
9
votes
2 answers

Per pixel softmax for fully convolutional network

I'm trying to implement something like a fully convolutional network, where the last convolution layer uses filter size 1x1 and outputs a 'score' tensor. The score tensor has shape [Batch, height, width, num_classes]. My question is, what function…
Wei Liu
  • 1,004
  • 1
  • 10
  • 17
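
For a [batch, height, width, num_classes] score tensor the answer is simply softmax along the last axis, which yields an independent class distribution per pixel; a sketch (assuming TF2-style eager execution):

    import tensorflow as tf

    scores = tf.random.normal([2, 4, 4, 3])   # [batch, H, W, num_classes]
    probs = tf.nn.softmax(scores, axis=-1)    # one distribution per pixel
    print(tf.reduce_sum(probs, axis=-1))      # all ones: each pixel sums to 1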
8
votes
3 answers

Implementation of softmax function returns nan for high inputs

I am trying to implement softmax at the end of a CNN, but the output I get is nan and zeros. I am giving softmax high input values, around 10-20k, e.g. an array X = [2345, 3456, 6543, -6789, -9234]. My function is def softmax(X): B = np.exp(X) …
Alok Ranjan Swain
  • 109
  • 1
  • 1
  • 7
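
With inputs in the thousands, np.exp overflows to inf and the division then produces nan; the standard fix is to subtract the maximum first, which changes nothing mathematically because softmax is invariant to adding a constant to every input:

    import numpy as np

    def softmax(X):
        X = np.asarray(X, dtype=np.float64)
        B = np.exp(X - X.max())   # shift by the max so exp never overflows
        return B / B.sum()

    print(softmax([2345, 3456, 6543, -6789, -9234]))   # no nan, sums to 1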
8
votes
2 answers

Softmax derivative in NumPy approaches 0 (implementation)

I'm trying to implement the softmax function for a neural network written in NumPy. Let h be the softmax value of a given signal i. I've struggled to implement the softmax activation function's partial derivative. I'm currently stuck at issue…
jorgenkg
  • 4,140
  • 1
  • 34
  • 48
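
Part of the answer to this kind of vanishing derivative is that softmax is almost always paired with cross-entropy, and the combined gradient collapses to the much better-behaved s - y, which is why most libraries fuse the two; a sketch:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    x = np.array([2.0, -1.0, 0.5])    # pre-softmax signal
    y = np.array([1.0, 0.0, 0.0])     # one-hot target

    s = softmax(x)
    grad = s - y    # d(cross_entropy(softmax(x)))/dx, no explicit Jacobian
    print(grad)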
7
votes
1 answer

What is a dimensional range of [-1,0] in Pytorch?

So I'm struggling to understand some terminology about collections in PyTorch. I keep running into the same kinds of errors about the range of my tensors being incorrect, and when I try to Google for a solution, often the explanations are further…
Reactgular
  • 52,335
  • 19
  • 158
  • 208
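
The range [-1, 0] in those error messages is ordinary Python-style negative indexing over a tensor's dimensions: a 1-D tensor has only dimension 0, addressable either as 0 or as -1 (the last dimension). A sketch:

    import torch
    import torch.nn.functional as F

    t = torch.tensor([1.0, 2.0, 3.0])    # 1-D: valid dims are 0 and -1
    print(F.softmax(t, dim=-1))          # -1 = the last (here, only) dimension
    # F.softmax(t, dim=1) raises: dim must be in the range [-1, 0]

    m = torch.randn(2, 3)                # 2-D: valid dims are -2, -1, 0, 1
    print(F.softmax(m, dim=-1).sum(dim=-1))   # each row sums to 1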
7
votes
4 answers

Logsoftmax stability

I know how to make softmax stable by subtracting max_i x_i from every element. This avoids overflow and underflow. Now, taking the log of this can still underflow: softmax(x) can evaluate to zero, and log(0) gives -infinity. I am not sure how to fix it. I know this…
Abhishek Bhatia
  • 9,404
  • 26
  • 87
  • 142
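
The standard remedy is to stay in log space and never take the log of an underflowed softmax: log softmax(x) = (x - max) - log sum(exp(x - max)), the log-sum-exp trick. A NumPy sketch:

    import numpy as np

    def log_softmax(x):
        x = np.asarray(x, dtype=np.float64)
        shifted = x - x.max()
        # safe: the largest term inside exp is exactly 1, so no overflow,
        # and the sum is >= 1, so the log never blows up
        return shifted - np.log(np.exp(shifted).sum())

    x = np.array([1000.0, 1.0, -1000.0])
    print(log_softmax(x))   # finite everywhere; naive log(softmax(x)) gives -inf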
7
votes
1 answer

semantic segmentation with tensorflow - ValueError in loss function (sparse-softmax)

So, I'm working on building a fully convolutional network (FCN), based on Marvin Teichmann's tensorflow-fcn. My input image data, for the time being, is a 750x750x3 RGB image. After running through the network, I use logits of shape…
Shiva
  • 473
  • 1
  • 6
  • 21
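
The usual cause of that ValueError is shape: tf.nn.sparse_softmax_cross_entropy_with_logits wants logits of shape [N, num_classes] and integer labels of shape [N] with no trailing channel dimension, so both get flattened per pixel. A sketch with hypothetical shapes:

    import tensorflow as tf

    num_classes = 2
    logits = tf.random.normal([1, 750, 750, num_classes])   # per-pixel scores
    labels = tf.zeros([1, 750, 750], dtype=tf.int32)        # class index per pixel

    # flatten pixels: logits -> [N, num_classes], labels -> [N]
    flat_logits = tf.reshape(logits, [-1, num_classes])
    flat_labels = tf.reshape(labels, [-1])

    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=flat_labels, logits=flat_logits))
    print(loss)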