
Setup

I built an autoencoder with TensorFlow for images. My images are around 30 pixels in height and width. I am using 5 layers:

  1. Input layer
  2. Encoder layer with 256 neurons and linear activation. (This layer is supposed to act as a kind of PCA preprocessing.)
  3. Encoder layer with 128 neurons and sigmoid activation.
  4. Decoder layer with 256 neurons and sigmoid activation.
  5. Decoder/output layer with as many neurons as the input and linear activation.

All layers use biases and are defined like this:

layer_1 = tf.nn.sigmoid(tf.add(
   tf.matmul(x, tf.Variable(tf.random_normal([n_input, n_hidden_1]))),
   tf.Variable(tf.random_normal([n_hidden_1]))
))
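To illustrate what this layer definition computes, here is a minimal sketch in plain Python (no TensorFlow needed). It assumes 30 x 30 images flattened to 900 inputs and a hypothetical layer width of 16. It relies on `tf.random_normal` defaulting to mean 0 and standard deviation 1. With that many inputs in the range 0-1, the pre-activations can become large in magnitude, which may push many sigmoid units close to 0 or 1.

```python
import math
import random

random.seed(0)

n_input = 900   # assumption: 30 x 30 pixels, flattened
n_hidden = 16   # hypothetical small width, enough to show the effect


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense_sigmoid(x, weights, biases):
    # Same computation as tf.nn.sigmoid(tf.add(tf.matmul(x, W), b)) for one input row.
    pre = [sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
           for j in range(len(biases))]
    return pre, [sigmoid(z) for z in pre]


# Made-up input: pixel values drawn uniformly from [0, 1).
x = [random.random() for _ in range(n_input)]
# Weights and biases drawn from N(0, 1), like tf.random_normal with its defaults.
W = [[random.gauss(0.0, 1.0) for _ in range(n_hidden)] for _ in range(n_input)]
b = [random.gauss(0.0, 1.0) for _ in range(n_hidden)]

pre, out = dense_sigmoid(x, W, b)
saturated = sum(1 for a in out if a < 0.01 or a > 0.99)

print("pre-activations:", [round(z, 1) for z in pre])
print("activations:    ", [round(a, 3) for a in out])
print("units below 0.01 or above 0.99:", saturated, "of", n_hidden)
```

Saturated sigmoid units have gradients close to zero, so this is one possible contributor to a cost that barely moves. It is only a sketch of the effect, not a diagnosis of the network above.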

My cost is defined by

cost = tf.reduce_mean(tf.div(tf.reduce_sum(tf.pow(y_true - y_pred, 2)), 2))
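A small worked example may help in reading this cost. `tf.reduce_sum` without an axis argument collapses the whole batch to a single scalar, so the outer `tf.reduce_mean` has nothing left to average, and the cost is half the summed squared error over all pixels of all images in the batch. The sketch below reproduces that in plain Python with made-up numbers. `cost_mean` is a hypothetical per-pixel alternative, not something from the setup above.

```python
def cost_as_written(y_true, y_pred):
    # Sum of squared errors over every image and pixel, divided by 2.
    # The mean of this single scalar is the scalar itself.
    total = sum((t - p) ** 2
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))
    return total / 2.0


def cost_mean(y_true, y_pred):
    # Hypothetical alternative: mean squared error per pixel.
    n_values = sum(len(row) for row in y_true)
    return 2.0 * cost_as_written(y_true, y_pred) / n_values


# Made-up batch of two images with two pixels each.
small_true = [[1.0, 0.0], [0.5, 0.5]]
small_pred = [[0.0, 0.0], [0.5, 0.0]]

# The same rows repeated three times: a batch that is three times larger.
big_true, big_pred = small_true * 3, small_pred * 3

print(cost_as_written(small_true, small_pred))  # 0.625
print(cost_as_written(big_true, big_pred))      # 1.875
print(cost_mean(small_true, small_pred))        # 0.3125
print(cost_mean(big_true, big_pred))            # 0.3125
```

Because the cost as written grows with batch size and image size, its absolute value is hard to compare across datasets, and the effective gradient scale changes with the batch size as well.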

and I used RMSPropOptimizer with an initial learning rate of 0.01.

I trained the autoencoder with around 250000 images. Below I show the results for test data that was not used for training. The upper row always shows the input images, the lower row always shows the output images.

The autoencoder works satisfactorily on the MNIST data (array cells are values from 0-1): [image: MNIST inputs and outputs]

The autoencoder also works satisfactorily on photos I took myself (array cells are values from 0-100): [image: photo inputs and outputs]

The root mean square errors

rmse = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(y_true, y_pred)), axis=1))

are as one would expect: higher for the first three photos (darkened, whitened, and with a cross drawn in) and lower for the unchanged photos:

17.5, 29.6, 12.9, 11.7, 11.2, 11.7, 7.3, 7.1, 7.1, 8.1
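For reference, the per-image RMSE used here (mean over axis 1, meaning over the pixels of each flattened image) can be sketched in plain Python as follows. The pixel values are made up for illustration and are not taken from the photos above.

```python
import math


def rmse_per_image(y_true, y_pred):
    # One value per image: square root of the mean squared pixel difference.
    return [math.sqrt(sum((t - p) ** 2 for t, p in zip(row_t, row_p)) / len(row_t))
            for row_t, row_p in zip(y_true, y_pred)]


# Two made-up "images" with four pixels each, values on a 0-100 scale.
y_true = [[0.0, 0.0, 0.0, 0.0], [10.0, 20.0, 30.0, 40.0]]
y_pred = [[3.0, 4.0, 0.0, 0.0], [10.0, 20.0, 30.0, 40.0]]

print(rmse_per_image(y_true, y_pred))  # [2.5, 0.0]
```

Note that the RMSE is in the same units as the pixel values, so numbers from data on a 0-100 scale are not directly comparable to numbers from data on a 0-1 scale.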

But the autoencoder does not work satisfactorily for texture images (array cells are values from 0-1). First I used RMSPropOptimizer and got

[image: texture inputs (upper row) and outputs (lower row) with RMSPropOptimizer]

Since my costs were pretty high during training and did not change, I followed the advice from https://stackoverflow.com/a/40956761/4533188 and used AdamOptimizer. Indeed, I got better results with

[image: texture inputs (upper row) and outputs (lower row) with AdamOptimizer]

While the costs are lower, they are still constant during the epochs:

[image: plot of the cost over the epochs]

Also, the output images are pretty dark. I believe this might also be the reason why my rmses are not as I would expect:

0.4642, 0.2669, 0.4976, 0.4378, 0.4753, 0.4688, 0.4615, 0.4571, 0.4691, 0.4487

Please note that I would expect the rmse to be high for the first image, because of the dot in the middle, and for the second image, because it is darkened. I guess the rmses are not as hoped because the output images are so dark.
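The guess about dark outputs can be made concrete with a small sketch. It assumes the extreme case of an all-black output (every pixel 0), which is a simplification of the dark outputs described above. The texture values are random made-up numbers. In that case the RMSE is just the root mean square brightness of the input, so a darkened input gets a lower RMSE, not a higher one.

```python
import math
import random

random.seed(1)


def rmse(y_true, y_pred):
    # Root mean square error for a single flattened image.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


n_pixels = 900  # assumption: 30 x 30 pixels, flattened

# Made-up texture with pixel values between 0.2 and 0.8.
normal = [random.uniform(0.2, 0.8) for _ in range(n_pixels)]
# The same texture darkened to half its brightness.
darkened = [0.5 * v for v in normal]
# Hypothetical extreme case: the autoencoder outputs an all-black image.
black_output = [0.0] * n_pixels

rmse_normal = rmse(normal, black_output)
rmse_dark = rmse(darkened, black_output)

print("unchanged input:", round(rmse_normal, 4))
print("darkened input: ", round(rmse_dark, 4))
```

Under this assumption the darkened image scores exactly half the RMSE of the unchanged one. That would be consistent with the second value in the list above being the lowest, although this sketch alone does not prove that this is what happens in the trained network.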

Questions

  1. Why are the costs not decreasing over the epochs and what can I do to get this going?
  2. Why are the output images so dark and what can I do to get them to be better representations of the input images?
  3. How do I get the rmses to be as I expect?
Make42

1 Answer


Your network is clearly not training

  • Your autoencoder layers are not symmetric
  • Don't use sigmoid, use ReLU
  • Use a better initialization technique (the specific distribution depends on the activation function)
  • Share the weights between the encoder and decoder layers
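The last three points can be sketched together in plain Python. This is a minimal illustration under stated assumptions, not the asker's network: the layer sizes (900 inputs, 128 hidden units) are hypothetical, the helper names are made up, and Glorot (Xavier) uniform initialization is used as one common choice. For ReLU layers, He initialization (standard deviation sqrt(2 / fan_in)) is another common pairing.

```python
import math
import random

random.seed(0)


def glorot_uniform(fan_in, fan_out):
    # Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)).
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[random.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def relu(values):
    return [max(0.0, z) for z in values]


def matvec(x, W):
    # Row vector x (length n) times matrix W (n x m).
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def transpose(W):
    return [list(col) for col in zip(*W)]


n_input, n_hidden = 900, 128  # hypothetical sizes

W = glorot_uniform(n_input, n_hidden)  # the only weight matrix
b_enc = [0.0] * n_hidden               # separate biases for encoder and decoder
b_dec = [0.0] * n_input

x = [random.random() for _ in range(n_input)]  # made-up input in [0, 1)

# Encoder: ReLU(x W + b_enc)
code = relu([z + b for z, b in zip(matvec(x, W), b_enc)])
# Decoder with tied weights: linear output using the transposed encoder matrix.
recon = [z + b for z, b in zip(matvec(code, transpose(W)), b_dec)]

print(len(code), len(recon))
```

In TensorFlow 1.x, a comparable setup would presumably create the weight variable once, for example with `tf.get_variable` and an `initializer` argument such as `tf.contrib.layers.xavier_initializer()`, and reuse its transpose in the decoder.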
fabrizioM
  • 1. What is your recommendation for the layers if I want the output to be linear and have the first hidden layer to be linear? 3. What do you recommend for sigmoid, linear, ReLU? 4. What do you mean? Share them on SO? – Make42 Feb 21 '17 at 20:27
  • 2. Why should I use ReLU instead of sigmoid? – Make42 Feb 21 '17 at 21:24
  • ReLU has proven to be easier to train than sigmoid. I recommend you study this very good resource to be successful in developing neural networks: deeplearningbook.org – fabrizioM Feb 21 '17 at 21:33
  • How do I initialise with other initialisers? I read that it is a good idea to use the Xavier initializer. I found https://www.tensorflow.org/api_docs/python/tf/contrib/layers/xavier_initializer. How do I use it in my setup? – Make42 Feb 22 '17 at 18:07