Questions tagged [softmax]

Use this tag for programming-related questions about the softmax function, also known as the normalized exponential function. Questions specific to a certain programming language should also be tagged with that language.

534 questions
5 votes • 1 answer

Is Cross Entropy With Softmax proper for Multi-label Classification?

As mentioned here, cross entropy is not a proper loss function for multi-label classification. My question is: is this also true for cross entropy with softmax? If it is, how can it be reconciled with this part of the document? I should mention…
OmG • 18,337 • 10 • 57 • 90
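A minimal PyTorch sketch (my own illustration, not code from the question) of why a softmax couples the labels while independent sigmoids do not:

    import torch
    import torch.nn.functional as F

    logits = torch.tensor([[2.0, 2.0, -2.0]])   # two labels clearly "on"
    target = torch.tensor([[1.0, 1.0, 0.0]])    # multi-label target

    # Softmax forces the outputs to compete: they sum to 1, so two
    # simultaneously active labels can each get at most ~0.5.
    print(F.softmax(logits, dim=1))             # ~[0.50, 0.50, 0.01]

    # Independent sigmoids let every label approach 1 on its own, which
    # is why sigmoid + binary cross entropy is the usual multi-label loss.
    print(torch.sigmoid(logits))                # ~[0.88, 0.88, 0.12]
    print(F.binary_cross_entropy_with_logits(logits, target))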
5 votes • 2 answers

TensorFlow log_softmax vs tf.log(tf.nn.softmax(predict)) vs tf.nn.softmax_cross_entropy_with_logits

I am trying to implement an MNIST CNN following the TensorFlow tutorial, and I found that these ways of implementing softmax cross entropy give different results: (1) bad result: softmax = tf.nn.softmax(pred) cross_entropy_cnn = - y * tf.log(softmax +…
Alex Wang • 59 • 1 • 5
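The usual explanation is numerical stability: tf.nn.log_softmax applies the log-sum-exp trick rather than computing log(softmax(x)) literally. A framework-free numpy sketch of the two formulas:

    import numpy as np

    logits = np.array([1000.0, 0.0])            # extreme but well-defined

    # Naive log(softmax(x)): exp(1000) overflows, giving nan/-inf.
    naive = np.log(np.exp(logits) / np.exp(logits).sum())

    # Log-sum-exp form stays finite: shift by the max first.
    shifted = logits - logits.max()
    stable = shifted - np.log(np.exp(shifted).sum())

    print(naive)    # [ nan -inf] (with overflow warnings)
    print(stable)   # [    0. -1000.]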
5 votes • 1 answer

Does the Inception Model have two softmax outputs?

The Inception v3 model is shown in this image, which is from this blog post: https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html It seems that there are two Softmax classification outputs. Why is that? Which one is…
questiondude • 772 • 7 • 15
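The second softmax is Inception's auxiliary classifier: a training-time head attached to an intermediate layer. A rough Keras sketch of the pattern (the layer shapes and the 0.4 loss weight here are illustrative, not Inception's exact values):

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    inputs = tf.keras.Input(shape=(299, 299, 3))
    x = layers.Conv2D(32, 3, strides=2, activation="relu")(inputs)
    mid = layers.Conv2D(64, 3, activation="relu")(x)    # intermediate features
    x = layers.Conv2D(128, 3, activation="relu")(mid)

    # Auxiliary softmax head: gives earlier layers a more direct gradient
    # signal during training and is ignored at inference time.
    aux = layers.GlobalAveragePooling2D()(mid)
    aux_out = layers.Dense(1000, activation="softmax", name="aux")(aux)

    main = layers.GlobalAveragePooling2D()(x)
    main_out = layers.Dense(1000, activation="softmax", name="main")(main)

    model = Model(inputs, [main_out, aux_out])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  loss_weights={"main": 1.0, "aux": 0.4})  # aux down-weighted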
5 votes • 1 answer

The output of a softmax isn't supposed to have zeros, right?

I am working on a network in TensorFlow which produces a vector that is then passed through a softmax; this is my output. Now I have been testing this, and weirdly enough the vector (the one passed through softmax) has zeros in every coordinate but…
Alperen AYDIN • 547 • 1 • 6 • 17
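Exact zeros are expected in floating point: any entry far enough below the max underflows. A short PyTorch demonstration:

    import torch

    # Softmax is strictly positive in exact arithmetic, but in float32 an
    # entry roughly 100+ nats below the max underflows to exactly 0.
    logits = torch.tensor([0.0, -200.0])
    print(torch.softmax(logits, dim=0))           # tensor([1., 0.])

    # In float64 the same entry is still representable:
    print(torch.softmax(logits.double(), dim=0))  # [1.0000e+00, 1.3839e-87]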
5 votes • 1 answer

Train TensorFlow language model with NCE or sampled softmax

I'm adapting the TensorFlow RNN tutorial to train a language model with an NCE loss or sampled softmax, but I still want to report perplexities. However, the perplexities I get are very weird: for NCE I get several million (terrible!) whereas for…
niefpaarschoenen • 560 • 1 • 8 • 19
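One likely cause: sampled losses (NCE, sampled softmax) are computed against a small sampled vocabulary, so exponentiating the training loss does not give a perplexity. A hedged numpy sketch of reporting perplexity from the full softmax instead:

    import numpy as np

    def perplexity(logits, targets):
        # logits: (num_tokens, vocab_size); targets: (num_tokens,) int ids
        logits = logits - logits.max(axis=1, keepdims=True)   # stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        nll = -log_probs[np.arange(len(targets)), targets].mean()
        return np.exp(nll)                                    # exp(mean NLL)

    rng = np.random.default_rng(0)
    print(perplexity(rng.normal(size=(4, 10000)), rng.integers(0, 10000, 4)))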
5 votes • 2 answers

Understanding softmax classifier

I am trying to understand a simple implementation of a softmax classifier from this link - CS231n - Convolutional Neural Networks for Visual Recognition. Here they implemented a simple softmax classifier. In the example of the Softmax Classifier on the…
Shubhashis • 10,411 • 11 • 33 • 48
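For reference, a minimal numpy loss-and-gradient function in the spirit of the CS231n softmax classifier (variable names are illustrative, not the course's exact code):

    import numpy as np

    def softmax_loss(W, X, y, reg=1e-3):
        scores = X @ W                                  # (N, C) class scores
        scores -= scores.max(axis=1, keepdims=True)     # numeric stability
        probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
        N = X.shape[0]
        loss = -np.log(probs[np.arange(N), y]).mean() + reg * np.sum(W * W)
        dscores = probs
        dscores[np.arange(N), y] -= 1                   # dL/dscores = p - 1{y}
        dW = X.T @ dscores / N + 2 * reg * W
        return loss, dW

    X = np.random.randn(5, 4)
    y = np.array([0, 1, 2, 1, 0])
    W = 0.01 * np.random.randn(4, 3)
    loss, dW = softmax_loss(W, X, y)
    print(loss, dW.shape)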
4 votes • 2 answers

What is the difference between torch.nn.Softmax, torch.nn.functional.softmax, torch.softmax and torch.nn.functional.log_softmax?

I tried to find documentation but could not find anything about torch.softmax. What is the difference among torch.nn.Softmax, torch.nn.functional.softmax, torch.softmax and torch.nn.functional.log_softmax? Examples are appreciated.
olo • 133 • 1 • 7
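In short, they compute the same function through different entry points; a quick check:

    import torch
    import torch.nn.functional as F

    x = torch.randn(2, 5)

    # torch.softmax, F.softmax and the nn.Softmax module agree; the
    # module form just stores `dim` so it can live inside nn.Sequential.
    a = torch.softmax(x, dim=1)
    b = F.softmax(x, dim=1)
    c = torch.nn.Softmax(dim=1)(x)
    print(torch.allclose(a, b) and torch.allclose(b, c))    # True

    # log_softmax is log(softmax(x)) computed in a numerically stabler way:
    print(torch.allclose(F.log_softmax(x, dim=1),
                         torch.log(F.softmax(x, dim=1))))   # True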
4 votes • 1 answer

How to correctly use Cross Entropy Loss vs Softmax for classification?

I want to train a multi-class classifier using PyTorch. The official PyTorch doc shows how to use nn.CrossEntropyLoss() after a last layer of type nn.Linear(84, 10). However, I remember that this is what Softmax does. This leaves me…
Gulzar • 23,452 • 27 • 113 • 201
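The short answer is that nn.CrossEntropyLoss applies log_softmax itself, so the last layer should emit raw logits; a quick PyTorch check:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    logits = torch.randn(3, 10)         # raw outputs of nn.Linear(84, 10)
    target = torch.tensor([4, 1, 7])

    # CrossEntropyLoss == log_softmax + NLL, so no Softmax layer is needed.
    loss = nn.CrossEntropyLoss()(logits, target)
    manual = F.nll_loss(F.log_softmax(logits, dim=1), target)
    print(torch.allclose(loss, manual))     # True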
4 votes • 2 answers

How to plot a ROC curve from a softmax binary classifier with 2 output neurons?

How can I plot the ROC curve when the discrete output labels come as 2 columns? Using roc_curve() gives me an error: ValueError: multilabel-indicator format is not supported. y_prediction = model.predict(test_X) y_prediction[1] Out[27]:…
Boels Maxence • 349 • 3 • 16
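The usual fix is to pass roc_curve a single score column, the softmax probability of the positive class, rather than both columns. A hedged sklearn sketch with made-up data:

    import numpy as np
    from sklearn.metrics import roc_curve, auc

    y_true = np.array([0, 1, 1, 0])       # integer labels (argmax if one-hot)
    y_pred = np.array([[0.9, 0.1],        # two softmax columns per sample
                       [0.2, 0.8],
                       [0.4, 0.6],
                       [0.7, 0.3]])

    # Use only the positive-class column as the score:
    fpr, tpr, thresholds = roc_curve(y_true, y_pred[:, 1])
    print(auc(fpr, tpr))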
4 votes • 2 answers

torch.softmax and torch.sigmoid are not equivalent in the binary case

Given: x_batch = torch.tensor([[-0.3, -0.7], [0.3, 0.7], [1.1, -0.7], [-1.1, 0.7]]) and then applying torch.sigmoid(x_batch): tensor([[0.4256, 0.3318], [0.5744, 0.6682], [0.7503, 0.3318], [0.2497, 0.6682]]) gives a…
CutePoison • 4,679 • 5 • 28 • 63
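They are related but not elementwise equal; for two classes, the softmax of the pair equals the sigmoid of the logit difference:

    import torch

    x = torch.tensor([[-0.3, -0.7], [0.3, 0.7], [1.1, -0.7], [-1.1, 0.7]])

    # Elementwise sigmoid treats each logit independently, while softmax
    # normalizes across the pair: softmax(x)[:, 1] == sigmoid(x1 - x0).
    print(torch.softmax(x, dim=1)[:, 1])
    print(torch.sigmoid(x[:, 1] - x[:, 0]))    # identical values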
4 votes • 1 answer

Why does softmax get a small gradient when the values are large, as in the paper 'Attention Is All You Need'?

This is a screenshot from the original paper. I understand the paper to mean that when the value of the dot product is large, the gradient of the softmax gets very small. However, I tried to calculate the gradient of…
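A small numeric sketch of the saturation effect (my own example): scaling the logits up drives the softmax toward one-hot and its Jacobian toward zero, which is why the paper divides dot products by sqrt(d_k):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    z = rng.normal(size=8)
    for scale in (1.0, 8.0, 64.0):
        p = softmax(scale * z)
        jac = np.diag(p) - np.outer(p, p)   # d softmax_i / d z_j
        print(scale, np.abs(jac).max())     # shrinks as the scale grows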
4 votes • 2 answers

Why does softmax_cross_entropy_with_logits_v2 return a cost even for identical values?

I have tested softmax_cross_entropy_with_logits_v2 with random numbers: import tensorflow as tf x = tf.placeholder(tf.float32, shape=[None,5]) y = tf.placeholder(tf.float32, shape=[None,5]) softmax =…
UfXpri • 53 • 1 • 4
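This is expected: when the predicted distribution equals the label distribution, the cross entropy reduces to the entropy of that distribution, which is zero only for a one-hot vector. A tiny numpy check:

    import numpy as np

    # H(p, q) = -sum(p * log q) reduces to the entropy H(p) when q == p.
    p = np.array([0.2, 0.2, 0.2, 0.2, 0.2])
    print(-(p * np.log(p)).sum())            # ~1.609, not 0

    one_hot = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
    eps = 1e-12                              # avoid log(0)
    print(-(one_hot * np.log(one_hot + eps)).sum())   # ~0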
4 votes • 1 answer

Why not use the max value of the output tensor instead of the softmax function?

I built a CNN model for one-class image classification. The output tensor is a list with 65 elements. I feed this tensor into the softmax function and get the classification result. I think the max value in this output tensor is the classified…
Li Shihao • 45 • 5
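Softmax is monotonic, so it never changes which class wins; it matters because training needs a differentiable probability distribution. A quick PyTorch check:

    import torch

    logits = torch.tensor([2.0, 1.0, 0.1])

    # argmax is unchanged by softmax, so the max logit suffices at inference.
    print(torch.argmax(logits))                        # tensor(0)
    print(torch.argmax(torch.softmax(logits, dim=0)))  # tensor(0), same class

    # Softmax supplies the probabilities the cross-entropy loss needs.
    print(torch.softmax(logits, dim=0))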
4 votes • 1 answer

How to calibrate the thresholds of neural network output layer in multiclass classification task?

Assume we have a multi-class classification task with 3 classes: {Cheesecake, Ice Cream, Apple Pie}, and that we have a trained neural network that can classify which of the three desserts a random chef would prefer. Also assume that the output…
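One common approach (a hedged sketch, all names illustrative): sweep candidate thresholds per class on held-out softmax outputs and keep the one maximizing F1 for that class:

    import numpy as np

    def best_threshold(scores, is_class):
        # scores: softmax column for one class; is_class: boolean labels
        cands = np.unique(scores)
        f1s = []
        for t in cands:
            pred = scores >= t
            tp = (pred & is_class).sum()
            prec = tp / max(pred.sum(), 1)
            rec = tp / max(is_class.sum(), 1)
            f1s.append(0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec))
        return cands[int(np.argmax(f1s))]

    rng = np.random.default_rng(0)
    scores = rng.random(200)                             # one class's column
    labels = (scores + 0.2 * rng.normal(size=200)) > 0.5
    print(best_threshold(scores, labels))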
4 votes • 1 answer

What does "2-way" softmax mean?

I couldn't find a clear consensus on this. What does 2-way softmax mean, and how is it different from n-way softmax? The definition is given by Geoffrey Hinton in his Coursera course Neural Networks for Machine Learning in Quiz 4 to be: a softmax…
Andriy Drozdyuk • 58,435 • 50 • 171 • 272
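For reference, the standard algebra (not from the quiz itself) showing that a 2-way softmax is just a logistic unit applied to the difference of the two logits:

    \mathrm{softmax}(z_1, z_2)_1
      = \frac{e^{z_1}}{e^{z_1} + e^{z_2}}
      = \frac{1}{1 + e^{-(z_1 - z_2)}}
      = \sigma(z_1 - z_2)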