
Mean squared error is a popular cost function used in machine learning:

(1/n) * sum((y - pred)**2)

Basically, the order of the subtraction terms doesn't matter, since each difference is squared.
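For example, a quick NumPy check (the array values here are just an illustration):

```python
import numpy as np

# Illustrative values, not taken from any particular model
y = np.array([1.0, 2.0, 3.0])
pred = np.array([1.1, 1.9, 3.2])

# (a - b)**2 == (b - a)**2, so the subtraction order is irrelevant
mse_1 = np.mean((y - pred) ** 2)
mse_2 = np.mean((pred - y) ** 2)
assert np.isclose(mse_1, mse_2)
```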

But if we differentiate this function with respect to y, the difference is no longer squared:

(2/n) * (y - pred)

Would the order make a difference for a neural network?

Reversing the order of the terms y and pred would change the sign of this result. Since we use it to compute the gradient with respect to the weights, would it influence the way the neural network converges?

1 Answer


Well, actually

$$\frac{\partial}{\partial y_i} \frac{1}{n} \sum_j (y_j - \hat{y}_j)^2 = \frac{2}{n} (y_i - \hat{y}_i)$$

and

$$\frac{\partial}{\partial y_i} \frac{1}{n} \sum_j (\hat{y}_j - y_j)^2 = -\frac{2}{n} (\hat{y}_i - y_i) = \frac{2}{n} (y_i - \hat{y}_i),$$

so they're the same.

(I took the derivative with respect to $y_i$, assuming those are the network outputs, but of course the same holds if you differentiate with respect to $\hat{y}_i$.)
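As a quick numerical sanity check, here is a small NumPy sketch (the values are illustrative) showing that both orderings give the same gradient, and therefore the same weight updates:

```python
import numpy as np

# Illustrative values: y are the network outputs, y_hat the targets
y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.1, 1.9, 3.2])
n = len(y)

# Gradient of (1/n) * sum((y - y_hat)**2) with respect to y
grad_1 = (2 / n) * (y - y_hat)

# Gradient of (1/n) * sum((y_hat - y)**2) with respect to y:
# the chain rule contributes a factor of -1 that cancels the swap
grad_2 = -(2 / n) * (y_hat - y)

assert np.allclose(grad_1, grad_2)  # identical gradients either way
```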

cheersmate