
I want to use the MNIST dataset to train a simple autoencoder in Caffe with NVIDIA DIGITS.

I have: Caffe 0.16.4, DIGITS 5.1, Python 2.7

I use the network structure provided here: https://github.com/BVLC/caffe/blob/master/examples/mnist/mnist_autoencoder.prototxt

Then I face two problems:

  • When I use the provided structure, I get this error:

     Traceback (most recent call last):
       File "digits/scheduler.py", line 512, in run_task
         task.run(resources)
       File "digits/task.py", line 189, in run
         self.before_run()
       File "digits/model/tasks/caffe_train.py", line 220, in before_run
         self.save_files_generic()
       File "digits/model/tasks/caffe_train.py", line 665, in save_files_generic
         'cannot specify two val image data layers'
     AssertionError: cannot specify two val image data layers
    
  • When I remove the data layer for the "test-on-test" stage, I get a bad result like this: https://screenshots.firefox.com/8hwLmSmEP2CeiyQP/localhost

What is the problem?

azzz

1 Answer


The first problem occurs because the .prototxt has two layers named data with phase TEST. The first layer that consumes data, i.e. flatdata, does not know which of the two to use (the test-on-train or the test-on-test layer). That's why the error goes away when you remove one of the TEST-phase data layers. Edit: I've checked the solver file, and it has test_state settings that should switch between the two test stages, but that mechanism is clearly not working in your case.
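For illustration, the conflicting part of the example network looks roughly like this (a sketch modeled on the BVLC mnist_autoencoder.prototxt; paths and batch sizes are illustrative). Both layers share the name data and phase TEST, differing only in their stage, which is what the solver is supposed to select between:

```protobuf
# Sketch of the two TEST-phase data layers in mnist_autoencoder.prototxt.
# DIGITS rejects this with "cannot specify two val image data layers".
layer {
  name: "data"
  type: "Data"
  top: "data"
  include { phase: TEST stage: "test-on-train" }   # evaluates on training data
  data_param { source: "mnist_train_lmdb" batch_size: 100 backend: LMDB }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  include { phase: TEST stage: "test-on-test" }    # evaluates on test data
  data_param { source: "mnist_test_lmdb" batch_size: 100 backend: LMDB }
}
```

Since DIGITS supplies its own data layers and only supports a single validation source, the practical workaround is to keep just one TEST-phase data layer (or let DIGITS generate the data layers entirely) rather than relying on the stage mechanism.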

The second problem is a little more difficult to solve, and my knowledge of autoencoders is limited. It seems your Euclidean loss changes very little across iterations; I would decrease the base learning rate in your solver.prototxt and then check how the losses fluctuate.
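As a sketch, these are the solver fields I would look at first (values modeled on the BVLC mnist_autoencoder_solver.prototxt; the exact numbers are illustrative, and lowering base_lr by a factor of 10 is a common first step):

```protobuf
# solver.prototxt (sketch; paths and values are assumptions, not your actual file)
net: "mnist_autoencoder.prototxt"
test_state: { stage: "test-on-train" }   # selects which TEST-stage data layer runs
test_iter: 500
test_state: { stage: "test-on-test" }
test_iter: 100
test_interval: 500
base_lr: 0.001        # the example ships with 0.01; try an order of magnitude lower
lr_policy: "step"
gamma: 0.1
stepsize: 10000
display: 100
max_iter: 65000
snapshot: 10000
snapshot_prefix: "mnist_autoencoder"
solver_mode: GPU
```

If the loss still barely moves after lowering base_lr, plotting the training vs. validation loss curves over iterations usually tells you whether it is a learning-rate issue or something structural in the network.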

Besides that, for the epochs/iterations that achieved a low error, have you checked the output data/images? Do they make sense?

rafaspadilha