This question might be very application-specific, but I am stuck and thought this would be a good place to ask. Say we have a sequence-to-sequence LSTM in Keras, for example a part-of-speech tagger. The last layer outputs a sequence of labels, with a probability for each label at each time step. Consider the following predicted output:

A = [[0.1, 0.3, 0.2, 0.4],[0.2, 0.2, 0.2, 0.4],[0.5, 0.2, 0.1, 0.1]]

Basically, this is a sequence of length 3 with 4 possible tags at each time step.

Now I would like to change this sequence into the following:

A' = [[0, 0, 0, 1],[0, 0, 0, 1],[1, 0, 0, 0]]

In other words, I want to put a 1 at the position of the maximum probability and set all other entries to 0. Any help is much appreciated.

Rouzbeh

1 Answer

You can use this slightly modified version of a sampling function:

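import numpy as np

def set_max_to_one(preds, temperature=0.01):
    # A low-temperature softmax over each row pushes the largest entry towards 1.
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds.T / np.sum(exp_preds, axis=1)
    # Round before casting: astype alone truncates values such as 0.9999 to 0.
    return np.rint(preds).astype("int16").T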

This returns what you expect. You can adjust the temperature so that the computation stays stable: if it is too small, the exponentials underflow to zero and the division returns NaN, but 0.01 should be good enough. You might also want to change the dtype of the output array.
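To illustrate the stability point, here is a minimal sketch of the same low-temperature softmax with the row maximum subtracted before exponentiating. The function name sharpen_rows is hypothetical and not part of the original answer; it assumes a NumPy array (or nested list) of strictly positive probabilities with the tags on the last axis.

```python
import numpy as np

def sharpen_rows(preds, temperature=0.01):
    """Low-temperature softmax over the last axis, computed stably."""
    logits = np.log(np.asarray(preds, dtype=float)) / temperature
    # Subtracting the row maximum leaves the softmax unchanged but keeps
    # the largest exponent at exp(0) = 1, so the denominator is never 0.
    logits = logits - logits.max(axis=-1, keepdims=True)
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=-1, keepdims=True)

A = [[0.1, 0.3, 0.2, 0.4], [0.2, 0.2, 0.2, 0.4], [0.5, 0.2, 0.1, 0.1]]

# Even a very small temperature no longer produces NaN.
sharpened = sharpen_rows(A, temperature=0.001)
one_hot = np.rint(sharpened).astype("int16")
```

With this shift the choice of temperature matters much less, because the largest entry of every row always contributes exactly 1 to the sum.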

Note that this works on a NumPy array. If you want to use it on a Keras tensor, you will need to modify it (taking the batch dimension into account, for example). Hope this helps.
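For plain NumPy arrays, a different technique from the softmax trick above is to take the argmax directly and index an identity matrix with it. The sketch below is an alternative, not the method of this answer; the name one_hot_argmax is hypothetical, and it assumes the tags sit on the last axis, so a leading batch dimension is handled as well.

```python
import numpy as np

def one_hot_argmax(preds):
    """Return an int16 array with a 1 at each row's maximum, 0 elsewhere."""
    preds = np.asarray(preds)
    num_tags = preds.shape[-1]
    # Index of the largest probability along the last (tag) axis.
    winners = preds.argmax(axis=-1)
    # Indexing an identity matrix with those indices yields one-hot rows.
    return np.eye(num_tags, dtype="int16")[winners]

A = [[0.1, 0.3, 0.2, 0.4], [0.2, 0.2, 0.2, 0.4], [0.5, 0.2, 0.1, 0.1]]
A_prime = one_hot_argmax(A)

# A batch of two sequences, shape (2, 3, 4), works the same way.
batch_prime = one_hot_argmax([A, A])
```

When two tags tie for the maximum, argmax picks the first one, so every row still contains exactly one 1.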

EDIT:

This should work in keras:

import keras.backend as K

def set_max_to_one(x, temperature=0.01):
    x = K.log(x)/temperature
    return K.round(K.softmax(x))

Instead of backend.softmax() you could use keras.activations.softmax(), which takes an axis argument, if you need to set the axis.

Note that the output is still a float tensor, not an int tensor. If that matters, K.cast(x, "int32") should convert it; otherwise it shouldn't make much difference.

gionni
  • Thanks @gionni, yeah exactly I am looking for tensor version implementation because I am doing this in Keras. I feel like once it is done in Keras it becomes part of the graph. – Rouzbeh Jun 29 '17 at 16:16