UPDATE:
The original MSE implementation looks as follows:
def mean_squared_error(y_true, y_pred):
    if not K.is_tensor(y_pred):
        y_pred = K.constant(y_pred)
    y_true = K.cast(y_true, y_pred.dtype)
    return K.mean(K.square(y_pred - y_true), axis=-1)
I think the correct maximizing loss function would be:
from keras import backend as K

def mean_squared_error_max(y_true, y_pred):
    if not K.is_tensor(y_pred):
        y_pred = K.constant(y_pred)
    y_true = K.cast(y_true, y_pred.dtype)
    # the squared reciprocal of the error grows as y_pred approaches y_true,
    # so minimizing this value pushes the predictions away from the targets
    return K.mean(K.square(1 / (y_pred - y_true)), axis=-1)
This way we always get a positive loss value, as with the MSE function, but with the effect reversed.
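For illustration, here is a minimal sketch of how such a custom loss could be plugged into compile; the toy model, input shape and optimizer below are my own assumptions, not part of the original question:

# Minimal usage sketch: assumes the mean_squared_error_max defined above
# and the Keras 2.x API; the tiny model is purely illustrative.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, input_shape=(4,))])
# Keras accepts any callable with the (y_true, y_pred) signature as a loss,
# so the custom "maximizer" plugs in exactly like the built-in 'mse' string.
model.compile(loss=mean_squared_error_max, optimizer='adam')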
UPDATE 2:
Initially I wrote that the intuitive first thought, simply negating the loss, would NOT give the result we expect because of how the optimization methods work (you can read an interesting discussion here).
After I double-checked both methods head to head on a particular learning task (note: I didn't do an exhaustive test), the result was that both methods achieved loss maximization, though the -loss
approach converged a bit faster. I am not sure whether it always gives the best solution, or any solution at all, because of the possible issue described here.
If someone has other experience, please let me know.
So if somebody wants to give the -loss approach a try too:
def mean_squared_error(y_true, y_pred):
    if not K.is_tensor(y_pred):
        y_pred = K.constant(y_pred)
    y_true = K.cast(y_true, y_pred.dtype)
    # identical to the standard MSE, only negated
    return -K.mean(K.square(y_pred - y_true), axis=-1)
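For anyone who wants to reproduce the head-to-head check mentioned above, a rough sketch could look like the following; the random data, toy architecture and epoch count are arbitrary assumptions on my part:

# Rough comparison sketch: train the same tiny model once with each candidate
# "maximizer" loss and watch the reported training loss over the epochs.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x = np.random.rand(256, 4)
y = np.random.rand(256, 1)

def make_model(loss_fn):
    m = Sequential([Dense(8, activation='relu', input_shape=(4,)), Dense(1)])
    m.compile(loss=loss_fn, optimizer='adam')
    return m

# mean_squared_error here refers to the negated version defined just above;
# mean_squared_error_max is the 1/(y_pred - y_true) version from the first update.
for loss_fn in (mean_squared_error_max, mean_squared_error):
    model = make_model(loss_fn)
    history = model.fit(x, y, epochs=5, verbose=0)
    print(loss_fn.__name__, history.history['loss'])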
Additional details:
OP wrote:
I have a generative adversarial network, where the discriminator gets
minimized with the MSE and the generator should get maximized. Because
both are opponents who pursue the opposite goal.
From the link provided by Ibragil:
Meanwhile, the generator is creating new, synthetic images that it
passes to the discriminator. It does so in the hopes that they, too,
will be deemed authentic, even though they are fake. The goal of the
generator is to generate passable hand-written digits: to lie without
being caught. The goal of the discriminator is to identify images
coming from the generator as fake.
So this is an ill-posed problem:
In a GAN our final goal is to train our two counterparties, the discriminator and the generator, to perform as well as possible against each other. This means that the two underlying learning algorithms have different tasks, but the loss function with which they can achieve the optimal solution is the same, i.e. binary_crossentropy, so each model's task is to minimize this loss.
A discriminator model's compile method:
self.discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
A generator model's compile method:
self.generator.compile(loss='binary_crossentropy', optimizer=optimizer)
It is the same as two runners whose goal is to minimize their own time to reach the finish, even though they are competitors in this task.
So the "opposite goal" doesn't mean an opposite task: both still minimize their loss (just as both runners minimize their time in the example above).
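To make this concrete, here is a hedged sketch of the usual Keras GAN training pattern; the layer sizes, variable names and placeholder data are my assumptions, not the OP's code. Both models minimize binary_crossentropy, and the "opposite goals" come purely from the labels each side is trained on:

import numpy as np
from keras.models import Sequential, Model
from keras.layers import Dense, Input

latent_dim = 32

# Toy generator and discriminator; the architectures are placeholders.
generator = Sequential([Dense(64, activation='relu', input_shape=(latent_dim,)),
                        Dense(784, activation='sigmoid')])
discriminator = Sequential([Dense(64, activation='relu', input_shape=(784,)),
                            Dense(1, activation='sigmoid')])
discriminator.compile(loss='binary_crossentropy', optimizer='adam')

# Combined model: the generator is trained through a frozen discriminator.
discriminator.trainable = False
z = Input(shape=(latent_dim,))
combined = Model(z, discriminator(generator(z)))
combined.compile(loss='binary_crossentropy', optimizer='adam')

# One illustrative training step on random placeholder data:
real = np.random.rand(16, 784)
noise = np.random.randn(16, latent_dim)
fake = generator.predict(noise)

# Discriminator step: real -> 1, fake -> 0 (it minimizes binary_crossentropy).
discriminator.train_on_batch(real, np.ones((16, 1)))
discriminator.train_on_batch(fake, np.zeros((16, 1)))

# Generator step: the fake samples are labelled 1 ("real"), so minimizing the
# very same loss pushes the discriminator's verdict towards "real" --
# no negated or inverted loss is needed.
combined.train_on_batch(noise, np.ones((16, 1)))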
I hope it helps.