
I'd like to use a neural network to predict a scalar value that is the sum of a function of the input values and a random value (I'm assuming a Gaussian distribution) whose variance also depends on the input values. Now I'd like a neural network with two outputs: the first output should approximate the deterministic part (the function), and the second output should approximate the variance of the random part, depending on the input values. What loss function do I need to train such a network?

(An example in Python for TensorFlow would be nice, but I'm also interested in general answers. I'm also not quite clear how I could write something like that in Python code; none of the examples I have found so far show how to address individual outputs from within the loss function.)
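To make the setup concrete, data of the kind I mean could be generated like this; f and the noise-scale function are arbitrary examples, not my actual problem:

import numpy as np

# Illustrative data: f and the input-dependent noise scale are made-up examples.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(10000, 1))
f = np.sin(3.0 * x)                        # deterministic part f(x)
sigma = 0.1 + 0.4 * np.abs(x)              # standard deviation depends on x
y = f + rng.normal(size=x.shape) * sigma   # observed scalar target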

Dr. Hans-Peter Störr

3 Answers


You can use dropout for that. With a dropout layer you can make several different predictions, each based on a different random choice of which nodes are dropped out. You can then look at the distribution of these outcomes and interpret its spread as a measure of uncertainty.

For details, read:

Gal, Yarin, and Zoubin Ghahramani. "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning." International Conference on Machine Learning, 2016.
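A minimal sketch of that idea in Keras (architecture, dropout rate, and sample count are placeholders): keep dropout active at prediction time by calling the model with training=True, then take the mean and variance of the sampled outputs.

import numpy as np
import tensorflow as tf

# Placeholder architecture; the important part is the Dropout layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),
])
# ... compile and train the model as usual ...

def mc_predict(model, x, n_samples=100):
    # training=True keeps dropout active, so each forward pass
    # samples a different subnetwork.
    samples = np.stack([model(x, training=True).numpy() for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)  # predictive mean and variance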

Martin Thoma
  • Thanks! Interesting and creative approach but sounds rather difficult to implement. Since I found nothing I could just take, I came up with something on my own. What do you think about the direct approach of just using the NN to approximate mean and variance directly using mean_squared_error: https://colab.research.google.com/drive/1CqHvbQifTMrC67hyDx2vU4SyIMt4ADUq ? – Dr. Hans-Peter Störr May 20 '19 at 20:04

Since I found nothing simple to implement, I wrote something myself that models this explicitly: a custom loss function that trains the network to predict both mean and variance. It seems to work, but I'm not quite sure how well it works out in practice, and I'd appreciate feedback. This is my loss function:

import tensorflow as tf
from tensorflow.keras import backend as K

def meanAndVariance(y_true: tf.Tensor, y_pred: tf.Tensor) -> tf.Tensor:
  """Loss function that trains the even-indexed entries of the last axis of
  y_pred to approximate the values in y_true, and the odd-indexed entries to
  approximate the squared error of those predictions (i.e. the variance)."""
  y_pred = tf.convert_to_tensor(y_pred)
  y_true = tf.cast(y_true, y_pred.dtype)
  mean = y_pred[..., 0::2]      # even indices: predicted means
  variance = y_pred[..., 1::2]  # odd indices: predicted variances
  squared_error = K.square(mean - y_true)
  # First term fits the mean; second term fits the variance to the squared error.
  res = squared_error + K.square(variance - squared_error)
  return K.mean(res, axis=-1)

The output dimension is twice the label dimension: a mean and a variance for each value in the label. The loss function consists of two parts: a mean squared error that pulls the mean output toward the label value, and a second term that pulls the variance output toward the squared difference between the label and the predicted mean.
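A model using this loss just needs an output layer with twice as many units as the label; here is a minimal sketch (layer sizes are placeholders):

import tensorflow as tf

# Sketch of a model trained with the loss above; layer sizes are placeholders.
label_dim = 1
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(2 * label_dim),  # interleaved outputs: mean, variance
])
model.compile(optimizer="adam", loss=meanAndVariance)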

Dr. Hans-Peter Störr

When using dropout to estimate the uncertainty (or any other stochastic regularization method), make sure to also check out our recent work on a sampling-free approximation of Monte-Carlo dropout.

https://arxiv.org/pdf/1908.00598.pdf

We essentially follow your idea: treat the activations as random variables and then propagate mean and variance through the network to the output layer using error propagation. Consequently, we obtain two outputs - the mean and the variance.
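For intuition, here is a simplified sketch of such moment propagation through a single dense layer, assuming independent inputs (an illustration only, not the exact scheme from the paper):

import numpy as np

def propagate_dense(W, b, mean_in, var_in):
    # For y = W x + b with independent inputs:
    # E[y] = W E[x] + b,  Var[y] = (W**2) Var[x]
    mean_out = W @ mean_in + b
    var_out = (W ** 2) @ var_in
    return mean_out, var_out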

jgpostels