So from what I've understood, the formula for the MSE is: MSE = 1/n * ∑(t−y)^2, where n is the number of training examples, t is the target output and y is the actual output. Let's say I have 2 training examples, each with 1 output:
[0;0] t=[0] y=[1]
[1;1] t=[1] y=[1]
Applying the formula, I get MSE = 1/2 * [(0-1)^2 + (1-1)^2] = 1/2.
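To double-check that arithmetic, here is a minimal Python sketch of the single-output case. The function name `mse` and the list layout are just my own choices for illustration, not taken from any library:

```python
def mse(targets, outputs):
    """Mean of the squared differences over n examples, one output each."""
    n = len(targets)
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / n

targets = [0, 1]  # t for the inputs [0;0] and [1;1]
outputs = [1, 1]  # y for the same two inputs

result = mse(targets, outputs)
print(result)  # 0.5
```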
But what if I have more than 1 output per training example? Do I calculate the MSE of each training example over its outputs and then take the mean of all the MSEs I got?
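To make the question concrete, here is a rough sketch of the procedure I have in mind, next to the alternative of averaging over every single output value at once. The two-output targets and outputs below are made up purely for illustration, and the function names are hypothetical. As far as I can tell, the two approaches give the same number whenever every example has the same number of outputs, but I am not sure which one is the accepted convention:

```python
def per_example_mse(t, y):
    """MSE over the outputs of a single example."""
    return sum((ti - yi) ** 2 for ti, yi in zip(t, y)) / len(t)

def mean_of_mses(targets, outputs):
    """First compute one MSE per example, then average those MSEs."""
    per_example = [per_example_mse(t, y) for t, y in zip(targets, outputs)]
    return sum(per_example) / len(per_example)

def overall_mse(targets, outputs):
    """Average the squared error over every output of every example."""
    squared_errors = [
        (ti - yi) ** 2
        for t, y in zip(targets, outputs)
        for ti, yi in zip(t, y)
    ]
    return sum(squared_errors) / len(squared_errors)

# Made-up data: 2 examples, each with 2 outputs (illustration only).
targets = [[0, 1], [1, 0]]
outputs = [[1, 1], [1, 1]]

a = mean_of_mses(targets, outputs)
b = overall_mse(targets, outputs)
print(a, b)  # 0.5 0.5
```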