I have a Keras deep learning model that outputs 6 variables.
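from keras.models import Sequential
from keras.layers import Dense

# 12 input features -> three hidden ReLU layers -> 6 linear regression outputs
model = Sequential()
model.add(Dense(32, input_dim=12, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(6, activation='linear'))
model.compile(loss='MSE', optimizer='adam')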
How are the errors for these six variables combined into a single MSE? Shouldn't there be a separate MSE for each of the six variables? Are the six errors scaled and then averaged?
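To make the question concrete, here is a minimal plain-Python sketch of one way six outputs can collapse into a single number. It assumes the loss averages the squared errors over the output axis for each sample and then averages those per-sample values over the batch, which is how Keras documents its mean squared error with the default reduction. The data values and the helper names `per_sample_mse` and `per_output_mse` are made up for illustration and are not part of Keras.

```python
# Hypothetical targets and predictions: 2 samples, 6 outputs each (made-up values).
y_true = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
          [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
y_pred = [[1.0, 2.0, 3.0, 4.0, 5.0, 8.0],
          [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]


def per_sample_mse(y_true, y_pred):
    """Mean of squared errors over the output axis: one value per sample."""
    return [sum((t - p) ** 2 for t, p in zip(ts, ps)) / len(ts)
            for ts, ps in zip(y_true, y_pred)]


def per_output_mse(y_true, y_pred):
    """Mean of squared errors over the batch: one value per output variable."""
    n_samples = len(y_true)
    n_outputs = len(y_true[0])
    return [sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n_samples)) / n_samples
            for j in range(n_outputs)]


# Assumed reduction: average over the 6 outputs, then over the batch.
sample_losses = per_sample_mse(y_true, y_pred)
total_loss = sum(sample_losses) / len(sample_losses)

# The same number is obtained by averaging the six per-variable MSEs.
output_losses = per_output_mse(y_true, y_pred)
mean_of_output_losses = sum(output_losses) / len(output_losses)

print(total_loss, mean_of_output_losses, output_losses)
```

In this sketch nothing is rescaled: the single value is just the unweighted mean of all squared errors, so an output with larger errors contributes more to it than the others.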