
Setup

I have a model with 3 inputs and 2 outputs (figure below). I have a loss defined for each output, but I also want to add a regularization term to each loss that is a function of both outputs:

L_V = MSE(v, y_v) + lambda * f(v, q)
L_Q = MSE(q, y_q) + lambda * f(v, q)

The regularizer f(v, q) acts as an additional restriction; for example, say I want to solve a trade-off problem of fitting Q and V while also minimizing the dot product v·q.
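For concreteness, here is a minimal sketch (assuming TensorFlow/Keras; the tensor names `v_pred`, `q_pred`, `y_v`, `y_q` and the weight `lambda_` are illustrative) of what the two losses look like written out. Note that each one needs three tensors, which is exactly what the standard Keras `(y_true, y_pred)` loss signature does not provide:

```python
import tensorflow as tf

lambda_ = 0.1  # hypothetical trade-off weight

def f(v_pred, q_pred):
    # Regularizer: mean over the batch of the dot product v . q
    return tf.reduce_mean(tf.reduce_sum(v_pred * q_pred, axis=-1))

def loss_v(y_v, v_pred, q_pred):
    # L_V = MSE(v, y_v) + lambda * f(v, q)
    return tf.reduce_mean(tf.square(v_pred - y_v)) + lambda_ * f(v_pred, q_pred)

def loss_q(y_q, q_pred, v_pred):
    # L_Q = MSE(q, y_q) + lambda * f(v, q)
    return tf.reduce_mean(tf.square(q_pred - y_q)) + lambda_ * f(v_pred, q_pred)
```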

[Figure: network diagram with three inputs and the two outputs v and q]

Question

Without the regularizer, I can pass my two losses as `model.compile(loss=[v_loss, q_loss])`. But how can I define the regularizer? My main challenge is reading the value of the other output inside the custom `v_loss` function, so that f(v, q) can be evaluated there.
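To make the constraint concrete, here is a minimal sketch of the no-regularizer baseline, with a hypothetical stand-in for the 3-input / 2-output network in the figure. Each loss in the list receives only the `(y_true, y_pred)` pair of its own output, so `q` is simply not visible inside `v_loss`:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical stand-in for the 3-input / 2-output network in the figure.
x1, x2, x3 = (tf.keras.Input(shape=(4,)) for _ in range(3))
h = layers.Dense(16, activation="relu")(layers.Concatenate()([x1, x2, x3]))
v = layers.Dense(1, name="v")(h)
q = layers.Dense(1, name="q")(h)
model = tf.keras.Model([x1, x2, x3], [v, q])

def v_loss(y_true, y_pred):
    # Only v's targets and predictions arrive here; q is not accessible.
    return tf.reduce_mean(tf.square(y_pred - y_true))

def q_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

model.compile(optimizer="adam", loss=[v_loss, q_loss])
```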

What I tried (and how it failed)

I concatenated V and Q into a single output and returned a single loss L_v + L_q + L_regu, but the network doesn't learn anything, even on the simplest linear data with plenty of iterations. I think the main problem is that the Q network is also trained by L_v and, likewise, the V network is also trained by L_q, which is wrong.
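For reference, a reconstruction of that attempt, continuing the toy model sketched above (the packing of `y_true` as `[y_v, y_q]` along the last axis is an assumption):

```python
# Concatenate v and q into one head and train with a single combined loss.
vq = layers.Concatenate(name="vq")([v, q])
combined = tf.keras.Model([x1, x2, x3], vq)

def combined_loss(y_true, y_pred):
    y_v, y_q = tf.split(y_true, 2, axis=-1)
    v_pred, q_pred = tf.split(y_pred, 2, axis=-1)
    l_v = tf.reduce_mean(tf.square(v_pred - y_v))
    l_q = tf.reduce_mean(tf.square(q_pred - y_q))
    l_regu = tf.reduce_mean(tf.reduce_sum(v_pred * q_pred, axis=-1))
    return l_v + l_q + lambda_ * l_regu

combined.compile(optimizer="adam", loss=combined_loss)
```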

Alireza
  • Maybe you have an error in your custom loss function. Or maybe the `L_regu` is much larger than `L_v` and `L_q`, and so you are minimizing only `L_regu` but not `L_v` and `L_q` – AndrzejO Sep 22 '22 at 20:17
  • On the contrary, `L_regu` is smaller, even by an order of magnitude. And even when I set lambda to a really small value, it doesn't train. As I mentioned, I think the loss `L_v + L_q` is being applied to both `v` and `q`; I'm not sure, though, whether this is the reason – Alireza Sep 22 '22 at 20:23
  • I don't think this is the issue, because the network "knows" how your loss is calculated. Adding `L_q` to `L_v` has no effect on how the weights of V affect the loss, so it doesn't affect the gradient of the V weights (see the gradient check sketched after these comments) – AndrzejO Sep 22 '22 at 20:32
  • In that case (ignoring the regularizer completely), if I have two outputs `V` and `Q` with two separate losses `L_v` and `L_q`, I should get the same results as with the concatenated output `[V, Q]` and the single loss `L_v + L_q`? I did both, and the network trained in the former but not in the latter. I'll double-check to make sure it isn't a silly mistake – Alireza Sep 22 '22 at 20:39
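The claim in the comments can be tested directly: if the gradient argument is right, the V head's weight gradients should be identical under `L_v` alone and under `L_v + L_q`. A quick check with `tf.GradientTape`, continuing the toy model from the sketches above (random data, purely illustrative):

```python
xs = [tf.random.normal((8, 4)) for _ in range(3)]
y_v_true, y_q_true = tf.random.normal((8, 1)), tf.random.normal((8, 1))
v_weights = model.get_layer("v").trainable_variables

with tf.GradientTape(persistent=True) as tape:
    v_pred, q_pred = model(xs)
    l_v = tf.reduce_mean(tf.square(v_pred - y_v_true))
    l_q = tf.reduce_mean(tf.square(q_pred - y_q_true))
    total = l_v + l_q

g_lv_only = tape.gradient(l_v, v_weights)
g_sum = tape.gradient(total, v_weights)
del tape

# If q really doesn't depend on the V head, these match up to float error.
for a, b in zip(g_lv_only, g_sum):
    tf.debugging.assert_near(a, b)
```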

0 Answers