I think they both mean the same thing. Let us denote your predictions for the $i$-th sample by $[x_i, y_i, z_i, xx_i, yy_i, zz_i]$ and the true values by $[t_{x_i}, t_{y_i}, t_{z_i}, t_{xx_i}, t_{yy_i}, t_{zz_i}]$.
Over a batch of $N$ samples, the first formulation minimizes:
$$L_1 = \frac{1}{N}\sum_{i=1}^{N} (x_i - t_{x_i})^2 + \dots + \frac{1}{N}\sum_{i=1}^{N} (zz_i - t_{zz_i})^2$$
The MSE loss will minimize:
$$L_2 = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{6}\left[(x_i - t_{x_i})^2 + \dots + (zz_i - t_{zz_i})^2\right]$$
You can see that $L_1 = 6\,L_2$: the two losses differ only by a constant factor, so they have the same minimizer and their gradients differ only in scale (which a learning-rate adjustment absorbs).
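A quick numeric sketch of this equivalence, using NumPy with randomly generated predictions and targets (the array shapes and names here are illustrative, not from your model):

```python
import numpy as np

# Hypothetical batch: N samples, 6 outputs each.
rng = np.random.default_rng(0)
N = 8
pred = rng.normal(size=(N, 6))
true = rng.normal(size=(N, 6))

# First form: sum of the six per-output MSEs (averages over samples only).
loss_sum = sum(np.mean((pred[:, j] - true[:, j]) ** 2) for j in range(6))

# Second form: single MSE over all N*6 entries (averages over samples AND outputs).
loss_mse = np.mean((pred - true) ** 2)

# They differ only by the constant factor 6, so they share the same minimizer.
assert np.isclose(loss_sum, 6 * loss_mse)
```

The same relationship holds if you replace the NumPy expressions with a framework loss such as `torch.nn.MSELoss`, since its default reduction averages over every element.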
I think this will hold true as long as your six outputs are independent variables, which I believe they are, since you model them as six distinct outputs with six ground-truth labels.