
I'm trying to understand and improve the loss and accuracy of my variational autoencoder. I trained the autoencoder on simple binary data:

data1 = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')

data2 = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')


data3 = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0], dtype='int32')

There are 100 samples of each pattern, so I have 300 samples in total.
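For reference, the 300-sample training matrix can be assembled like this (a minimal sketch; the variable name train is an assumption, not taken from my actual code):

import numpy as np

# stack 100 copies of each of the three 54-bit patterns -> shape (300, 54)
train = np.vstack([np.tile(data1, (100, 1)),
                   np.tile(data2, (100, 1)),
                   np.tile(data3, (100, 1))]).astype('float32')
print(train.shape)  # (300, 54)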

I tried to predict with the variational autoencoder

sent_encoded = encoder.predict(np.array(test), batch_size = batch_size)
sent_decoded = generator.predict(sent_encoded)

and got correct answers for a few rows

print(np.round_(sent_decoded[1]))
print(np.round_(sent_decoded[100]))
print(np.round_(sent_decoded[299]))

[ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.
  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.
  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.
  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
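Rounding every reconstruction and comparing it element-wise with the input gives a quick check of the reconstruction accuracy (a minimal sketch; test is assumed to be the 300 x 54 input matrix from above):

inputs = np.array(test)
recon = np.round_(sent_decoded)                     # threshold the sigmoid outputs at 0.5
per_bit = np.mean(recon == inputs)                  # fraction of matching bits
per_row = np.mean(np.all(recon == inputs, axis=1))  # fraction of perfectly decoded rows
print(per_bit, per_row)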

What I don't understand are the loss, the accuracy, and the MSE loss during model training.

I got a pretty nice loss chart:

[loss chart]

But why is the accuracy of the model not so great on such a simple dataset? Just look at it: [accuracy chart]

The MSE loss doesn't change and it stays pretty high: [MSE loss chart]

What can I do to get a 100% accurate model? Is a variational autoencoder capable of giving me a 100% accurate model on this type of data? Please show me with code.

MarioZ

1 Answer


A variational autoencoder is not a classifier, so accuracy doesn't really make sense here.

Measuring a VAE's quality by mean-squared reconstruction error alone can also be misleading. To put it shortly, a VAE doesn't optimize only the reconstruction loss: its objective also contains a KL-divergence term that pushes the latent distribution towards the prior, so the total loss won't converge to zero even when the reconstructions are perfect.
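For illustration, the objective of a standard Keras-style VAE looks roughly like this (a minimal sketch, not the asker's custom loss layer; z_mean, z_log_var and original_dim are assumed to come from the encoder's enclosing scope):

from keras import backend as K
from keras.losses import binary_crossentropy

def vae_loss(x, x_decoded):
    # reconstruction term: how closely the output matches the input
    reconstruction = original_dim * binary_crossentropy(x, x_decoded)
    # KL term: pulls the approximate posterior q(z|x) towards the unit-Gaussian prior
    kl = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    # the sum stays above zero even with perfect reconstructions,
    # because the KL term only vanishes when the latent code carries no information
    return K.mean(reconstruction + kl)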

You need to read more about what a variational autoencoder is and, specifically, what it optimizes. If you're only interested in classification, then pretraining a regular autoencoder and training a classifier on top of it may make more sense, as in the sketch below.
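As a rough sketch of that alternative (layer sizes, variable names and the number of classes are assumptions, not taken from the question):

from keras.models import Model
from keras.layers import Input, Dense

inp = Input(shape=(54,))
code = Dense(8, activation='relu')(inp)         # bottleneck features
out = Dense(54, activation='sigmoid')(code)     # reconstruction

autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# autoencoder.fit(train, train, epochs=50, batch_size=32)

# build a classifier on top of the pretrained bottleneck (the Dense layer's
# weights are shared with the autoencoder, so it starts from the learned features)
clf_out = Dense(3, activation='softmax')(code)  # e.g. 3 classes: data1/data2/data3
classifier = Model(inp, clf_out)
classifier.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
# classifier.fit(train, labels, epochs=50, batch_size=32)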

Jakub Bartczuk
  • I want to use it for anomaly detection, e.g. if some pattern other than data1, data2 or data3 appears in the test part of the dataset. But you are right, loss is the more appropriate measure; still, why is it so high (0.94) for such a simple dataset? – MarioZ Apr 11 '18 at 09:26
  • Why is the loss chart not converging to zero if I get 100% precise decoding of the test part of the dataset? I use a custom loss layer. – MarioZ Apr 25 '18 at 10:12