
I have just begun using Lasagne and Theano to do some machine learning in Python.

I am trying to modify the softmax class in Theano. I want to change how the activation function (softmax) is calculated. Instead of dividing e_x by e_x.sum(axis=1), I want to divide e_x by the sum of each consecutive group of three entries.

For instance, the result will be as follows:

sm[0] = e_x[0]/(e_x[0]+e_x[1]+e_x[2])
sm[1] = e_x[1]/(e_x[0]+e_x[1]+e_x[2])
sm[2] = e_x[2]/(e_x[0]+e_x[1]+e_x[2])
sm[3] = e_x[3]/(e_x[3]+e_x[4]+e_x[5])
sm[4] = e_x[4]/(e_x[3]+e_x[4]+e_x[5])
sm[5] = e_x[5]/(e_x[3]+e_x[4]+e_x[5])

and so on...
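
In plain NumPy, the computation I want would look like this (just a reference sketch, assuming the vector length is a multiple of 3):

import numpy

def grouped_softmax_reference(x):
    # Subtracting the max cancels in each ratio, so it only adds numerical stability.
    e_x = numpy.exp(x - x.max())
    groups = e_x.reshape(-1, 3)                  # one row per group of three
    return (groups / groups.sum(axis=1)[:, None]).ravel()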

The problem is that I cannot quite grasp how Theano carries out the computation.

Here is my main question: does it suffice to just change the perform() function in the softmax class?

Here is the original perform() function:

def perform(self, node, input_storage, output_storage):
    x, = input_storage
    # Subtract each row's max before exponentiating, for numerical stability.
    e_x = numpy.exp(x - x.max(axis=1)[:, None])
    # Normalize each row by its total.
    sm = e_x / e_x.sum(axis=1)[:, None]
    output_storage[0][0] = sm

Here is my modified perform():

def myPerform(self, node, input_storage, output_storage):
    x, = input_storage
    e_x = numpy.exp(x - x.max(axis=1)[:, None])
    sm = numpy.zeros_like(e_x)
    for i in range(0,symbolCount):
        total = e_x[3*i] + e_x[3*i+1] + e_x[3*i+2]
        sm[3*i] = e_x[3*i]/total
        sm[3*i+1] = e_x[3*i+1]/total
        sm[3*i+2] = e_x[3*i+2]/total
    output_storage[0][0] = sm

With the current code, I am getting an 'unorderable types: int() > str()' error when I use the predict method in Lasagne.


1 Answer


For something like this you're probably better off building a custom softmax from symbolic expressions rather than creating (or modifying) an operation.

Defining it symbolically gives you the gradient (and the other pieces a Theano operation needs) "for free", though it might run slightly slower than a hand-written operation could.

Here's an example:

import numpy
import theano
import theano.tensor as tt

x = tt.matrix()

# Use the built in softmax operation
y1 = tt.nnet.softmax(x)

# A regular softmax operation defined via ordinary Theano symbolic expressions
y2 = tt.exp(x)
y2 = y2 / y2.sum(axis=1)[:, None]

# Custom softmax operation
def custom_softmax(a):
    b = tt.exp(a)
    b1 = b[:, :3] / b[:, :3].sum(axis=1)[:, None]
    b2 = b[:, 3:] / b[:, 3:].sum(axis=1)[:, None]
    return tt.concatenate([b1, b2], axis=1)
y3 = custom_softmax(x)

f = theano.function([x], outputs=[y1, y2, y3])

x_value = [[.1, .2, .3, .4, .5, .6], [.1, .3, .5, .2, .4, .6]]
y1_value, y2_value, y3_value = f(x_value)
assert numpy.allclose(y1_value, y2_value)
assert y3_value.shape == y1_value.shape
a = numpy.exp(.1) + numpy.exp(.2) + numpy.exp(.3)
b = numpy.exp(.4) + numpy.exp(.5) + numpy.exp(.6)
c = numpy.exp(.1) + numpy.exp(.3) + numpy.exp(.5)
d = numpy.exp(.2) + numpy.exp(.4) + numpy.exp(.6)
assert numpy.allclose(y3_value, [
    [numpy.exp(.1) / a, numpy.exp(.2) / a, numpy.exp(.3) / a, numpy.exp(.4) / b, numpy.exp(.5) / b, numpy.exp(.6) / b],
    [numpy.exp(.1) / c, numpy.exp(.3) / c, numpy.exp(.5) / c, numpy.exp(.2) / d, numpy.exp(.4) / d, numpy.exp(.6) / d]
]), y3_value
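
The same approach generalizes to any number of groups via a reshape. Here is a sketch (grouped_softmax is an illustrative helper, and it assumes the number of columns is a multiple of group_size):

def grouped_softmax(a, group_size=3):
    # Softmax within each consecutive group of `group_size` columns.
    b = tt.exp(a)
    b = b.reshape((a.shape[0], -1, group_size))   # (rows, groups, group_size)
    b = b / b.sum(axis=2)[:, :, None]             # normalize within each group
    return b.reshape((a.shape[0], a.shape[1]))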

Daniel Renshaw
  • The problem is that I'd like to modify the class softmax because the class is used multiple times throughout the computation. I am wondering if it is possible to create a newSoftmax class using symbolic expressions. I wasn't able to find any documentation on it. – Jin Hoon Jeffrey Bang Sep 02 '15 at 02:51
  • Just create a Python function containing the custom softmax code; you can then call that function as many times as you like. A simple Theano operation is any Python callable that accepts and returns symbolic variables. – Daniel Renshaw Sep 02 '15 at 06:49
  • Is it possible to create a class instead of a function? I am not familiar with how Lasagne works. In lasagne/nonlinearities.py there is `def softmax(x): return theano.tensor.nnet.softmax(x)`, and I have to work with this function, which just returns a class. Should I just implement the symbolic expressions inside this function? If so, what should it return? – Jin Hoon Jeffrey Bang Sep 03 '15 at 04:03
  • `theano.tensor.nnet.softmax` returns an object, a symbolic variable. So too does `custom_softmax`. `custom_softmax` can be used as a direct replacement for lasagne's `softmax` function; see the sketch below. – Daniel Renshaw Sep 03 '15 at 06:16
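
A minimal sketch of that substitution (the input shape and unit count here are illustrative, not from the thread):

import lasagne

# custom_softmax is the function defined in the answer above; any Python
# callable that maps a symbolic variable to a symbolic variable works as
# a Lasagne nonlinearity.
l_in = lasagne.layers.InputLayer(shape=(None, 100))
l_out = lasagne.layers.DenseLayer(l_in, num_units=6,
                                  nonlinearity=custom_softmax)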