Hello StackOverflow people,
I am stuck on a problem and don't know what else to try. First off, I am using a custom loss function for a mixture density network (I believe the loss is the problem, but it could be something else entirely):
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def nll_loss(mu, sigma, alpha, y):
    # Gaussian mixture with mixing weights alpha and per-component mu/sigma
    gm = tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(probs=alpha),
        components_distribution=tfd.Normal(
            loc=mu,
            scale=sigma))
    # Negative mean log-likelihood of the targets under the mixture
    log_likelihood = gm.log_prob(tf.transpose(y))
    return -tf.reduce_mean(log_likelihood, axis=-1)
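One thing I have not tried yet: since sigma comes straight out of the network, a component could collapse toward zero width, which would drive log_prob to -inf/NaN. A guarded variant of the loss might look like this (sigma_min is a placeholder floor value I made up, not something from my current code):

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def nll_loss_guarded(mu, sigma, alpha, y, sigma_min=1e-4):
    # Floor sigma so no component can collapse to a zero-width Gaussian,
    # which would make log_prob blow up
    sigma = tf.maximum(sigma, sigma_min)
    gm = tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(probs=alpha),
        components_distribution=tfd.Normal(loc=mu, scale=sigma))
    return -tf.reduce_mean(gm.log_prob(tf.transpose(y)), axis=-1)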
The funny thing is that the network collapses after a varying amount of training, seemingly at random. Here is what I have already tried and checked:
- All my input data is scaled to the range [0, 1] (both x and y)
- Multiplying the y values and adding an integer offset to them, so their distance from zero is increased
- Different optimizers
- Gradient clipping in the optimizer (a sketch of this setup is below the list)
- Clipping the loss function
- Setting the learning rate to 0! (That one puzzles me the most, as I am sure my inputs are correct)
- Adding batch normalization to every layer of my network
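For completeness, here is roughly how I would instrument a run to catch where the first NaN appears. The tiny model, random data, and "mse" loss below are placeholders just to make the snippet self-contained; my real network outputs (mu, sigma, alpha) and uses the loss above:

import numpy as np
import tensorflow as tf

# Raise an error at the first op that produces inf/NaN, so the collapse
# can be traced to a specific tensor instead of a silent NaN loss
tf.debugging.enable_check_numerics()

# Placeholder model and data, just so the snippet runs end to end
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(1),
])
x = np.random.rand(64, 1).astype("float32")
y = np.random.rand(64, 1).astype("float32")

# clipnorm applies gradient clipping in the optimizer; TerminateOnNaN
# stops training at the first NaN loss instead of continuing silently
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3, clipnorm=1.0),
              loss="mse")
model.fit(x, y, epochs=1, callbacks=[tf.keras.callbacks.TerminateOnNaN()])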
Does anyone have an idea why this is happening? What am I missing? Thank you!