Hello StackOverflow people,

I am running into a problem and I don't know what else to try. First off, I am using a custom loss function for a mixture density network (at least I believe the loss is the problem, but maybe it's something else?):

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def nll_loss(mu, sigma, alpha, y):
    # Gaussian mixture: alpha holds the mixture weights,
    # mu and sigma the per-component means and standard deviations.
    gm = tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(probs=alpha),
        components_distribution=tfd.Normal(
            loc=mu,
            scale=sigma))

    log_likelihood = gm.log_prob(tf.transpose(y))

    # Negative mean log-likelihood over the batch
    return -tf.reduce_mean(log_likelihood, axis=-1)

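To make the expected shapes concrete, the loss can be exercised with dummy tensors like this (a minimal sketch; the shapes and values are illustrative, not my actual model):

# Dummy shapes: batch of 4 samples, mixture of 3 Gaussian components.
mu    = tf.zeros([4, 3])            # component means
sigma = tf.ones([4, 3])             # component std devs, must stay > 0
alpha = tf.fill([4, 3], 1.0 / 3.0)  # mixture weights, must sum to 1
y     = tf.zeros([4])               # one scalar target per sample

loss = nll_loss(mu, sigma, alpha, y)  # scalar negative log-likelihood
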
The strange thing is that the network collapses at a seemingly random point, after a varying amount of training. Things I have already tried and checked:

  • All my input data is scaled between 0 and 1 (both x and y)
  • I tried scaling up the y values and adding an integer offset, so that their distance from zero increases
  • Different optimizers
  • Gradient clipping in the optimizer (see the sketch after this list)
  • Clipping the loss function
  • Setting the learning rate to 0! (That one puzzles me the most, as I am sure my inputs are correct)
  • Adding batch normalization to every layer of my network
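
To be concrete about the clipping attempt, by gradient clipping I mean the standard Keras optimizer arguments, roughly like this (the values are examples, not my exact settings):

# Clip the global gradient norm before the update step
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
# or clip each gradient element individually:
# optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipvalue=0.5)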

Does anyone have an idea why this is happening? What am I missing? Thank you!
