
I'm using TensorFlow to build a simple autoencoder model, but there's a strange bug that I can't diagnose. My loss function looks like this:

def loss_func(x, y):
    return 0.5 * tf.reduce_mean(tf.pow(x - y, 2))

The total loss is then calculated by:

return self.loss_func(x , input) + self.reg_fac * reg

Now the problem: when I set reg_fac to 0, the loss comes out positive and the model seems to train well, but as I increase reg_fac the loss decreases, reaches negative values, and keeps decreasing.

reg is calculated like this for each autoencoder used:

return tf.reduce_mean(tf.pow(self.w1, 2)) + tf.reduce_mean(tf.pow(self.w2, 2))

where w1 is the encoder weights and w2 is the decoder weights. I know it's a stupid bug, but I can't find it.
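
For reference, here is a quick standalone sanity check (with dummy tensors standing in for the real inputs and weights, so the names below are just placeholders): both terms come out non-negative, which is why I don't see how their weighted sum can go negative.

    import numpy as np
    import tensorflow as tf

    # dummy stand-ins for the real inputs/weights, just to check the signs
    x = tf.constant(np.random.rand(4, 8), dtype=tf.float32)
    y = tf.constant(np.random.rand(4, 8), dtype=tf.float32)
    w1 = tf.constant(np.random.randn(8, 3), dtype=tf.float32)
    w2 = tf.constant(np.random.randn(3, 8), dtype=tf.float32)

    mse = 0.5 * tf.reduce_mean(tf.pow(x - y, 2))
    reg = tf.reduce_mean(tf.pow(w1, 2)) + tf.reduce_mean(tf.pow(w2, 2))

    with tf.Session() as sess:
        print(sess.run([mse, reg]))  # both values are >= 0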

My complete code is uploaded here: https://github.com/javaWarrior/dltest

Important files:

ae.py: the autoencoder model
sae.py: the stacked autoencoder model
mew.py: tests the model on SIFT features extracted from NUS-WIDE images
nus_wide.py: just an interface to NUS-WIDE
– mohRamadan
1 Answer


I am not sure where your error is coming from, but I believe there are some problems with your autoencoder model in general. A simple model should look like this example, adapted from the TensorFlow models repo.

    import tensorflow as tf

    # model (w1, b1, w2, b2 are the encoder/decoder variables)
    x = tf.placeholder(tf.float32, [None, len_input])
    h = tf.nn.softplus(tf.matmul(x, w1) + b1)   # encoder
    xHat = tf.matmul(h, w2) + b2                # decoder / reconstruction

    # cost: squared reconstruction error
    cost = 0.5 * tf.reduce_sum(tf.pow(xHat - x, 2.0))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

As it pertains to your question, the key difference might be using reduce_sum() rather than reduce_mean(); I am not sure why you want to use the latter.
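
To illustrate (a minimal sketch, not taken from your model): reduce_sum and reduce_mean differ only by a constant factor equal to the number of elements, so switching between them rescales the cost but cannot flip its sign.

    import tensorflow as tf

    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y = tf.constant([[1.5, 2.5], [2.0, 5.0]])
    sq_err = tf.pow(x - y, 2.0)

    sum_cost = 0.5 * tf.reduce_sum(sq_err)    # 0.5 * (0.25 + 0.25 + 1.0 + 1.0) = 1.25
    mean_cost = 0.5 * tf.reduce_mean(sq_err)  # sum divided by 4 elements, then halved = 0.3125

    with tf.Session() as sess:
        print(sess.run([sum_cost, mean_cost]))  # [1.25, 0.3125]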

Also, the AdamOptimizer should handle the regularization for you. As a side note, if you want to learn by implementing the regularization from scratch, I would recommend this tutorial.
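
If you do want the explicit version, a minimal sketch of adding an L2 weight penalty to the cost could look like this (reg_fac, w1, w2, xHat, and x are assumed to be whatever your model already defines):

    # explicit L2 regularization on the weights, added to the reconstruction cost
    reg = tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2)   # each term is 0.5 * sum of squared weights
    cost = 0.5 * tf.reduce_sum(tf.pow(xHat - x, 2.0)) + reg_fac * reg
    optimizer = tf.train.AdamOptimizer().minimize(cost)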

– Joshua Howard
  • I was using AdamOptimizer with the cost including a regularization term, but I don't think reduce_sum would make a difference, because it's a difference in scale only? – mohRamadan Mar 03 '17 at 14:32
  • My apologies then. I usually let the algorithm handle regularization if I'm using Adagrad or some other extension of it. The difference would be in scale only, but it is the proper Frobenius norm. I was assuming that there was something wrong with your model (that is why I included an example). Could you update the question with your full model? – Joshua Howard Mar 03 '17 at 15:52
  • It's kind of spaghetti code because I was just testing autoencoders, but I will upload it anyway – mohRamadan Mar 03 '17 at 17:32