I am working on a Keras model whose final layer is a Mixture Density Network with a custom loss function (the loss is the negative log likelihood of the data under a mixture of Gaussians, which training tries to minimize).

What confuses me is that the loss will sometimes hit an epoch in which it returns -inf, and then on the next epoch it is a finite number again (e.g. -2.1). The loss sometimes bounces between negative infinity and a finite value every other epoch.

The negative loss is evidently to be expected with an NLL loss, but this fluctuation is confusing to me. What explains this behavior within Keras? My understanding is that the -inf loss is caused by numeric underflow somewhere, but I'm not sure how the model can recover from this and re-establish numeric stability thereafter.
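For concreteness, here is a minimal sketch of the kind of mixture-density NLL I mean (not my exact layer; the component count, output layout, and sigma floor are placeholders I added for illustration):

```python
import tensorflow as tf

# Hypothetical sketch (not my exact code): K_MIX one-dimensional Gaussian
# components, with the final layer emitting [mu, log_sigma, logit_pi]
# concatenated along the last axis.
K_MIX = 5
SIGMA_FLOOR = 1e-6  # assumed stabilizer; without it sigma can collapse to ~0

def mdn_nll(y_true, y_pred):
    """Negative log likelihood of a 1-D Gaussian mixture."""
    mu, log_sigma, logit_pi = tf.split(y_pred, 3, axis=-1)
    sigma = tf.exp(log_sigma) + SIGMA_FLOOR

    # Per-component log density. If sigma underflows toward zero while
    # y_true sits near a component mean, this term blows up and the
    # batch loss comes back as -inf.
    log_gauss = (-0.5 * tf.square((y_true - mu) / sigma)
                 - tf.math.log(sigma)
                 - 0.5 * tf.math.log(2.0 * 3.14159265))

    # Mix the components in log space with logsumexp rather than summing
    # raw densities, which avoids under/overflow in the exponentials.
    log_pi = tf.nn.log_softmax(logit_pi, axis=-1)
    log_prob = tf.reduce_logsumexp(log_pi + log_gauss, axis=-1)
    return -tf.reduce_mean(log_prob)
```

The comments mark the term where a collapsing sigma would send the loss to -inf, which is what I suspect is happening, but that doesn't explain how training recovers on the following epoch.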

Does anyone know how this works? I'd be very grateful for any suggestions others can offer on this question.

duhaime
  • Did you try it with a lower learning rate? This type of fluctuation is often caused by a learning rate that is too high. – Geeocode Nov 04 '18 at 20:09
  • Yes, but my learning rate is now 0.000001 -- is that reasonable for a learning rate? Simpler (smaller) models I've built for trivial tasks have been able to use much larger learning rates. Is 0.000001 a reasonable lr? – duhaime Nov 04 '18 at 20:21
  • It depends heavily on the loss function, but a rate that small is not very common, I guess. – Geeocode Nov 04 '18 at 20:29

0 Answers