
Could someone give me a simple example of TensorFlow's sampled_softmax_loss?

I tried replacing the softmax and the cross_entropy of the tutorial with sampled_softmax_loss, using different values for num_sampled, but the results are really bad.

Cfis Yoi
  • Just to clarify, you know how to use it but want an example of it working successfully in a model? – Allen Lavoie Nov 03 '16 at 21:35
  • Let's see. From what I have understood reading the doc, this op applies sampling to the softmax, in order to avoid computing the sum over every class, and finally returns the training loss. Is that what it really does? In my code, what I've done is replace the softmax and cross_entropy lines with sampled_softmax_loss only. – Cfis Yoi Nov 04 '16 at 14:02
  • Ah. Yes, sampled_softmax_loss [computes the cross entropy after sampling](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn.py#L1413). The documentation should probably be more explicit about that. – Allen Lavoie Nov 04 '16 at 21:34

1 Answer


Sampled softmax is used when you have a large number of output classes. The main reason is that computing the normal softmax loss over a large number of output classes, let's say 5000, is very inefficient and heavy for the computer, because the normalization sums over every class. Sampled softmax instead considers only a small sample of k classes (num_sampled) out of the total number of classes when calculating the loss.
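For a concrete picture, here is a minimal sketch of how it might be wired up. This is not code from the question; the shapes and values such as num_sampled = 64 are assumptions, and keyword arguments are used because the positional argument order of sampled_softmax_loss has changed between TensorFlow versions:

```python
import tensorflow as tf

num_classes = 50000  # e.g. a vocabulary; any large output space
dim = 128            # size of the layer feeding the softmax
num_sampled = 64     # classes sampled per step (an assumed value)

# Output-layer parameters. Note the shape: [num_classes, dim], i.e. the
# transpose of what a plain tf.matmul(inputs, w) full softmax would use.
weights = tf.Variable(tf.truncated_normal([num_classes, dim], stddev=0.05))
biases = tf.Variable(tf.zeros([num_classes]))

inputs = tf.placeholder(tf.float32, [None, dim])  # activations feeding the softmax
labels = tf.placeholder(tf.int64, [None, 1])      # true class ids, shape [batch, 1]

# Training loss: cross entropy over the true class plus num_sampled
# sampled "negative" classes, instead of over all num_classes.
train_loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=weights, biases=biases, labels=labels, inputs=inputs,
    num_sampled=num_sampled, num_classes=num_classes))

# Evaluation: the sampled loss is only a training-time approximation,
# so compute the full softmax when measuring or predicting.
logits = tf.matmul(inputs, weights, transpose_b=True) + biases
eval_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=logits, labels=tf.reshape(labels, [-1])))
```

Note that the docs describe this op as for training only: if you also evaluate with the sampled loss, or num_sampled is very small, the numbers will look much worse than the full softmax, which may be part of why your results seem bad.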

One place this is used is in TensorFlow's sequence-to-sequence models.

These models predict things that occur in a sequential manner, e.g. given a sentence, predict the next word. Here you have many output classes; in this case the number equals the vocabulary size. So sampled softmax is very handy for this, as the sketch below shows. Link to the tensorflow seq2seq models
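As a rough, paraphrased sketch of how the (pre-1.0) seq2seq tutorial plugs the sampled loss in: names such as proj_w and num_samples follow the tutorial, the sizes are assumptions, and the callback's argument order has also varied between releases, so treat this as illustrative only.

```python
import tensorflow as tf

size = 128           # RNN hidden size (assumed)
vocab_size = 40000   # target vocabulary size (assumed)
num_samples = 512    # number of sampled classes

# Output projection from RNN states to the vocabulary.
w_t = tf.get_variable("proj_w", [vocab_size, size])  # [num_classes, dim]
b = tf.get_variable("proj_b", [vocab_size])

def sampled_loss(inputs, labels):
    # inputs: [batch, size] decoder outputs; labels: [batch] word ids.
    # (In some releases the callback receives its arguments in the
    # opposite order, so check your version.)
    labels = tf.reshape(labels, [-1, 1])
    return tf.nn.sampled_softmax_loss(
        weights=w_t, biases=b, labels=labels, inputs=inputs,
        num_sampled=num_samples, num_classes=vocab_size)

# The callback is then handed to the seq2seq helpers, e.g.:
# tf.nn.seq2seq.model_with_buckets(..., softmax_loss_function=sampled_loss)
```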

Shamane Siriwardhana