I am trying to implement an autoencoder using the regularization method described in this paper: "Saturating Auto-Encoders", Goroshin et al., 2013.
Essentially, it adds a penalty that minimizes the distance between each hidden layer output and the nearest flat (saturated) portion of the nonlinearity used to compute it.
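In other words (this is just my understanding of the method), for a hidden vector y the penalty is sum_i |y[i] - y_prime[i]|, where y_prime[i] is the value of the nearest flat region for hidden unit i.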
Assuming the nonlinearity is a step function with the step at 0.5, a simple (non-symbolic) implementation might be:
import numpy

y_prime = numpy.empty_like(y)
for i in range(len(y)):
    if y[i] < 0.5:
        y_prime[i] = 0
    else:
        y_prime[i] = 1
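Or, vectorized, which should be equivalent for a 1-D float array y:

# 0 where y < 0.5, 1 elsewhere
y_prime = numpy.where(y < 0.5, 0.0, 1.0)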
Then, the regularization cost can be simply:

numpy.abs(y - y_prime).sum()
I am trying to implement this functionality in Theano. I started from the denoising autoencoder code available on the Theano website and made some basic modifications to it:
def get_cost_updates(self, learning_rate, L1_reg):
    # I do not want to add noise for now, so there is no input corruption.
    # Directly compute the hidden layer values.
    y = self.get_hidden_values(self.x)
    z = self.get_reconstructed_input(y)
    # The original code computes the cross-entropy loss. I use the
    # quadratic loss instead, as my inputs are real-valued, not binary.
    # I have also added an L1 regularization term on the hidden layer
    # values.
    L = 0.5 * T.sum((self.x - z) ** 2, axis=1) + L1_reg * abs(y).sum()
    ...  # The rest is the same as the original.
The above loss function puts an L1 penalty on the hidden layer outputs, which should (hopefully) drive most of them to 0. In place of this simple L1 penalty, I want to use the saturating penalty described above.
Any idea how to do this? Where do I compute y_prime, and how do I express it symbolically?
I am a newbie to Theano and am still catching up on the symbolic computation part.
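For reference, here is my untested guess at a symbolic version for the step nonlinearity, using T.switch (the name saturation_penalty is just my own, and I am not sure this is the right approach):

# Guess: T.switch evaluates elementwise, so this should give 0 where
# y < 0.5 and 1 elsewhere, entirely symbolically.
y_prime = T.switch(T.lt(y, 0.5), 0.0, 1.0)
saturation_penalty = T.abs_(y - y_prime).sum()
L = 0.5 * T.sum((self.x - z) ** 2, axis=1) + L1_reg * saturation_penalty

If something like this is valid, does the gradient propagate correctly through T.switch here?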