TensorFlow autoencoder: How to get representative output?

Question

Setup

I built an autoencoder for images with TensorFlow. My images are around 30 pixels in height and width. I am using 5 layers:

Input layer
Encoder layer with 256 neurons with linear functions. (This layer is supposed to function sort of as a preprocessing PCA.)
Encoder layer with 128 neurons with sigmoid functions.
Decoder layer with 256 neurons with sigmoid functions.
Decoder/output layer with as many neurons as the input with linear function.

All layers use biases and are defined like this:

layer_1 = tf.nn.sigmoid(tf.add(
   tf.matmul(x, tf.Variable(tf.random_normal([n_input, n_hidden_1]))),
   tf.Variable(tf.random_normal([n_hidden_1]))
))
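One thing worth checking with a layer defined this way: tf.random_normal defaults to a standard deviation of 1.0, so with roughly 900 inputs (an assumption, based on 30 x 30 pixel images) the pre-activations can become large enough to push a sigmoid into its flat regions, where gradients are close to zero. The NumPy sketch below uses random stand-in data rather than the real images, and the 0.05 standard deviation is only an illustrative alternative, not a recommended value.

```python
import numpy as np

rng = np.random.default_rng(0)

n_input, n_hidden = 900, 256                     # assumed: 30 x 30 pixels, 256 units
x = rng.uniform(0.0, 1.0, size=(64, n_input))    # stand-in batch with values in 0-1


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def saturated_fraction(weight_std):
    """Fraction of sigmoid outputs closer than 0.01 to 0 or 1."""
    w = rng.normal(0.0, weight_std, size=(n_input, n_hidden))
    b = rng.normal(0.0, weight_std, size=n_hidden)
    a = sigmoid(x @ w + b)
    return float(np.mean((a < 0.01) | (a > 0.99)))


frac_unit_std = saturated_fraction(1.0)     # same scale as tf.random_normal's default
frac_small_std = saturated_fraction(0.05)   # illustrative smaller scale
print(frac_unit_std, frac_small_std)
```

If most activations are saturated from the first step, a cost that barely moves over the epochs would not be surprising.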

My cost is defined by

cost = tf.reduce_mean(tf.div(tf.reduce_sum(tf.pow(y_true - y_pred, 2)), 2))
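Note that tf.reduce_sum without an axis argument reduces over every dimension, so the outer tf.reduce_mean receives a scalar and changes nothing. The cost is therefore half the total squared error and grows with batch size and pixel count. A NumPy sketch of the same arithmetic on made-up arrays, next to a per-pixel mean for comparison:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 1.0, size=(8, 900))   # made-up batch: 8 images, 900 pixels
y_pred = rng.uniform(0.0, 1.0, size=(8, 900))

# Same arithmetic as the TensorFlow cost above: sum over everything, halve, then mean.
# The sum is already a scalar, so the mean has no effect.
summed_cost = np.mean(np.sum((y_true - y_pred) ** 2) / 2)

# For comparison: mean squared error per pixel, independent of batch size.
mse_cost = np.mean((y_true - y_pred) ** 2)

# The two differ only by a constant factor: (number of elements) / 2.
print(summed_cost, mse_cost, summed_cost / mse_cost)
```

Because the two costs differ only by a constant factor, switching between them changes the effective learning rate rather than the location of the minimum.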

and I used RMSPropOptimizer with an initial learning rate of 0.01.

I trained the autoencoder on around 250,000 images. Below I show the results for test data that was not used for training. The upper row always shows the input images, the lower row always shows the output images.

The autoencoder worked satisfactorily on the MNIST data (array cells are values from 0-1):

It also worked satisfactorily on photos I took myself (array cells are values from 0-100):

The root mean square errors

rmse = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(y_true, y_pred)), axis=1))

are as one would expect: higher for the first three photos (darkened, whitened, and with a cross drawn in) and lower for the unchanged photos:

17.5, 29.6, 12.9, 11.7, 11.2, 11.7, 7.3, 7.1, 7.1, 8.1

But the autoencoder does not work satisfactorily for texture images (array cells are values from 0-1). First I used RMSPropOptimizer and got:

Since my costs were pretty high during training and did not change, I followed the advice from https://stackoverflow.com/a/40956761/4533188 and used AdamOptimizer. Indeed, I got better results:

While the costs are lower, they are still constant during the epochs:

Also, the output images are pretty dark. I believe this might also be the reason why my RMSEs are not as I would expect:

0.4642, 0.2669, 0.4976, 0.4378, 0.4753, 0.4688, 0.4615, 0.4571, 0.4691, 0.4487

Please note that I would expect the RMSE to be high for the first image, because of the dot in the middle, and for the second image, because it is darkened. I guess the RMSEs are not as hoped because the output images are so dark.
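When comparing these numbers with the photo results, keep in mind that the two data sets use different value ranges (0-100 versus 0-1). Dividing each RMSE by the range of its data puts them on a common scale. A small sketch using only the values reported above (the helper name is illustrative):

```python
# RMSE values copied from the question.
photo_rmse = [17.5, 29.6, 12.9, 11.7, 11.2, 11.7, 7.3, 7.1, 7.1, 8.1]
texture_rmse = [0.4642, 0.2669, 0.4976, 0.4378, 0.4753,
                0.4688, 0.4615, 0.4571, 0.4691, 0.4487]

# Value ranges stated in the question: photos 0-100, textures 0-1.
photo_range, texture_range = 100.0, 1.0


def normalised(rmses, value_range):
    """Express each RMSE as a fraction of the data's value range."""
    return [r / value_range for r in rmses]


photo_norm = normalised(photo_rmse, photo_range)
texture_norm = normalised(texture_rmse, texture_range)

print("photos:  ", [round(v, 3) for v in photo_norm])
print("textures:", [round(v, 3) for v in texture_norm])
```

On this scale nine of the ten texture errors are above 0.4 of the value range, while nine of the ten photo errors are below 0.2. That would be consistent with a large, roughly constant brightness offset dominating the texture errors.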

Questions

Why are the costs not decreasing over the epochs and what can I do to get this going?
Why are the output images so dark and what can I do to get them to be better representations of the input images?
How do I get the rmses to be as I expect?

score 0 · Answer 1 · answered Feb 21 '17 at 19:12

Your network is clearly not training

Your autoencoder layers are not symmetric
Don't use sigmoid, use ReLU
Use a better initialization technique (the specific distribution depends on the activation function)
Share the weights between the encoder and decoder layers (tied weights)
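A minimal NumPy sketch of how these suggestions could fit together; it is not the asker's network and shows only the forward pass. It assumes a symmetric 900-256-128-256-900 layout (900 corresponds to 30 x 30 images), ReLU hidden layers, and decoder weights that are the transposes of the encoder weights. The two initializers are the ones commonly paired with sigmoid/tanh (Glorot, also called Xavier) and ReLU (He). All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def glorot_uniform(fan_in, fan_out):
    """Glorot/Xavier uniform init, commonly paired with sigmoid or tanh."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def he_normal(fan_in, fan_out):
    """He normal init, commonly paired with ReLU."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))


def relu(z):
    return np.maximum(z, 0.0)


n_input, n_hidden_1, n_hidden_2 = 900, 256, 128   # 900 assumes 30 x 30 images

# Encoder weights; the decoder reuses them transposed (tied weights).
w1 = he_normal(n_input, n_hidden_1)
w2 = he_normal(n_hidden_1, n_hidden_2)
b1, b2 = np.zeros(n_hidden_1), np.zeros(n_hidden_2)
b3, b4 = np.zeros(n_hidden_1), np.zeros(n_input)   # decoder keeps its own biases

# What one might draw instead for a sigmoid layer of the same shape.
w_sigmoid = glorot_uniform(n_input, n_hidden_1)


def autoencode(x):
    h1 = relu(x @ w1 + b1)        # encoder 900 -> 256
    code = relu(h1 @ w2 + b2)     # encoder 256 -> 128
    h3 = relu(code @ w2.T + b3)   # decoder 128 -> 256, shares w2
    return h3 @ w1.T + b4         # linear output 256 -> 900, shares w1


x = rng.uniform(0.0, 1.0, size=(4, n_input))   # stand-in batch
y = autoencode(x)
print(y.shape)
```

With tied weights only w1 and w2 are trained, which roughly halves the number of weight parameters and forces the decoder to mirror the encoder.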

fabrizioM


1. What is your recommendation for the layers if I want the output to be linear and have the first hidden layer to be linear? 3. What do you recommend for sigmoid, linear, ReLU? 4. What do you mean? Share them on SO? – Make42 Feb 21 '17 at 20:27
2. Why should I use ReLU instead of sigmoid? – Make42 Feb 21 '17 at 21:24
ReLU has proven to be easier to train than sigmoid. I recommend you study this very good resource to be successful in developing neural networks: deeplearningbook.org – fabrizioM Feb 21 '17 at 21:33
How do I initialise with other initialisers? I read that it is a good idea to use the Xavier initializer. I found https://www.tensorflow.org/api_docs/python/tf/contrib/layers/xavier_initializer. How do I use it in my setup? – Make42 Feb 22 '17 at 18:07
