
I have a neural network of the form N = W1 * Tanh(W2 * I), where I is the input vector/matrix. When I learn these weights, the output has a certain form. However, when I add a normalization layer, for example N' = Softmax( W1 * Tanh(W2 * I) ), a single element of the output vector of N' is close to 1 while the rest are almost zero. This happens not only with Softmax() but with any normalizing layer. Is there a standard solution to such a problem?
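A minimal sketch of the two variants described above (the dimensions, random weights, and input are hypothetical, chosen only to make the example runnable):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 16))   # hypothetical shapes
W2 = rng.standard_normal((16, 8))
I = rng.standard_normal(8)          # input vector

# Unnormalized model: N = W1 * Tanh(W2 * I)
N = W1 @ np.tanh(W2 @ I)

def softmax(x):
    e = np.exp(x - x.max())         # max-shift for numerical stability
    return e / e.sum()

# Normalized model: N' = Softmax(N) -- entries are positive and sum to 1,
# so whichever logit in N is largest tends to dominate the distribution.
N_prime = softmax(N)
```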

Rumu
    what do you mean by "certain form"? And why do you call it a problem? This is completely normal (and desired!) behaviour for normalizing in classification. What is the exact application (there is an attention tag yet no mention of attention in the question) – lejlot Oct 21 '17 at 19:05
  • It is a self-attention encoder-decoder model (as in N described above is a self-attention model) @lejlot By a certain form, I mean the output vector has certain characteristics (which are desired) like it increases till the middle and then decreases and increases alternately (e.g. 0.1,0.3,0.5, 1.5, 0.5, 1, 0.3, 1.2). However, after adding a Softmax Layer, I get something like this - (0.001, 0.001, 0, 0.01, 0.998, 0.001, 0, 0, ...). – Rumu Oct 22 '17 at 03:16
  • This simply means that the output `N` has one value significantly larger than others. Add the `N` value to the question. – Maxim Oct 22 '17 at 14:47

1 Answer


That is the expected behavior of the softmax function: it exponentiates its inputs, so the largest logit absorbs almost all of the probability mass. If you want each output squashed independently into (0, 1) rather than forced into a competitive distribution that sums to 1, perhaps what you need is a sigmoid function.
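A small illustration of the difference (plain Python, with the two functions defined inline; the logit vector is a made-up example where one value dominates):

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(xs):
    # Element-wise logistic function; each output is independent of the others.
    return [1.0 / (1.0 + math.exp(-x)) for x in xs]

logits = [1.0, 2.0, 10.0]
print(softmax(logits))  # the largest logit takes nearly all the probability mass
print(sigmoid(logits))  # each value squashed separately into (0, 1)
```

With softmax the third entry ends up near 1 and the others near 0, which is exactly the behavior described in the question; with sigmoid the smaller values stay clearly nonzero.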

Julio Daniel Reyes