It is common to have two objectives you want your model to optimize, and the typical solution is to combine them into a single cost function by taking a weighted sum:
C = a*C1 + b*C2
where a and b are selected to make sure one term doesn't dominate the other.
This way you can compute a single gradient and use that to update your weights on each training step.
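As a minimal sketch of the weighted-sum approach (toy quadratic costs with hand-written gradients and plain NumPy; the functions and weights here are just illustrative, not from any particular framework):

```python
import numpy as np

# Two toy cost functions of a parameter vector w, pulling in
# opposite directions: C1 is minimized at w = 1, C2 at w = -1.
def grad_C1(w):
    return 2.0 * (w - 1.0)   # gradient of sum((w - 1)^2)

def grad_C2(w):
    return 2.0 * (w + 1.0)   # gradient of sum((w + 1)^2)

a, b = 1.0, 1.0   # weights chosen so neither term dominates
lr = 0.1          # learning rate

w = np.full(3, 2.0)
for _ in range(200):
    # Gradient of the combined cost C = a*C1 + b*C2 is the
    # weighted sum of the individual gradients.
    g = a * grad_C1(w) + b * grad_C2(w)
    w -= lr * g

# With equal weights, gradient descent settles on the compromise
# minimum halfway between the two optima, w = 0.
```

With equal weights the combined gradient is 4w, so the updates shrink w geometrically toward the compromise point; changing a and b moves that compromise toward whichever objective is weighted more heavily.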
EDIT: If you are not getting good results, then either (1) you are not weighting the components correctly and one of them is dominating the gradients, or (2) the cost functions simply aren't compatible and there is no good way to minimize both at the same time. @sascha gave a trivial example of the latter in the comments: C1 = x and C2 = -x.
EDIT 2: If you are updating your parameters using the gradient of C1 and then using the gradient of C2, this is very similar to what I suggested, since the gradient of the sum is the sum of the gradients.
However, if you are recomputing the gradient after the first step, this can lead to unstable solutions, because at every iteration the second step may be "undoing" the work of the first (or vice versa), especially if the cost functions are not compatible. With my method you are more likely to arrive at a minimum. Both approaches are very similar, though, and if you are having problems it is likely for one of the reasons I mentioned.
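To see the "undoing" behaviour concretely, here is @sascha's incompatible pair C1 = x and C2 = -x under alternating updates (a toy sketch; the learning rate and step count are arbitrary):

```python
lr = 0.1
x = 0.0

history = []
for _ in range(5):
    x -= lr * 1.0     # gradient of C1 = x is 1
    x -= lr * (-1.0)  # gradient of C2 = -x is -1; exactly undoes the first step
    history.append(x)

# x never makes net progress: each pair of steps cancels, which
# matches the combined gradient a*1 + b*(-1) = 0 when a = b.
```

Both methods stall on this pair, which is the point: no weighting fixes genuinely incompatible objectives, but the alternating version additionally wastes two opposing steps per iteration instead of taking none.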