Sampled softmax is used when you have a large number of output classes.
The main reason is that computing the normal softmax loss over a large number of output classes, let's say 5000, is very expensive. Sampled softmax instead considers only a small sample of k classes out of the total number of classes when calculating the loss.
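Here is a minimal sketch of how the loss looks with `tf.nn.sampled_softmax_loss`; the variable names, sizes, and the choice of k (`num_sampled = 64`) are just illustrative assumptions, not values from any particular model.

```python
import tensorflow as tf

num_classes = 5000   # total number of output classes (e.g. vocabulary size)
num_sampled = 64     # k: how many negative classes to sample (assumed value)
hidden_size = 128
batch_size = 32

# Output projection variables (assumed shapes for illustration).
weights = tf.Variable(tf.random.truncated_normal([num_classes, hidden_size], stddev=0.05))
biases = tf.Variable(tf.zeros([num_classes]))

# Hidden activations from the model and the true class ids (random stand-ins here).
inputs = tf.random.normal([batch_size, hidden_size])
labels = tf.random.uniform([batch_size, 1], maxval=num_classes, dtype=tf.int64)

# Sampled softmax: the loss is computed over the true class plus
# `num_sampled` randomly sampled classes instead of all `num_classes`.
loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(
        weights=weights,
        biases=biases,
        labels=labels,
        inputs=inputs,
        num_sampled=num_sampled,
        num_classes=num_classes,
    )
)
```

Note that sampled softmax is meant for training only; at evaluation or inference time you would typically compute the full softmax over all classes.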
One example of where this is used is the sequence-to-sequence models in TensorFlow.
These models predict things that occur sequentially, for example, given a sentence, predicting the next word. To predict the next word you have one output class per word, so the number of output classes equals the vocabulary size, which makes sampled softmax very handy here.
Link to the TensorFlow seq2seq models