Let's say I have an input dataset of n = 100 observations and m = 5 columns, where the last column is the dependent variable and the other four are independent variables. It is a regression problem that I intend to solve with a neural network. After hyperparameter optimization, the best model for this problem turned out to have 1 hidden layer with 2 neurons and, of course, an output layer with just 1 neuron (outputting y_hat).
Everything goes well: the model is regularized to avoid overfitting, and the prediction error is satisfactory. The 10-fold cross-validation score is also far better than that of an ordinary multiple linear regression. So I want to keep this model as the best model.
The question is: how can we now compute the coefficient values of the input variables, given that the dimension of the data is reduced (from 4 to 2) as it passes through the hidden layer?
The solution I came up with is to work backward, solving the equations starting from the output layer. That is, if C1 is the output of the last layer and B1, B2 are the outputs of the hidden layer, then:
C1 = c_0 + c_1*B1 + c_2*B2
B1 = b1_0 + b1_1*x1 + b1_2*x2 + b1_3*x3 + b1_4*x4
B2 = b2_0 + b2_1*x1 + b2_2*x2 + b2_3*x3 + b2_4*x4
where c_0, b1_0, and b2_0 are the intercepts of the output neuron and the two hidden neurons respectively; c_1 and c_2 are the slopes in the output-layer equation; b1_1, b1_2, b1_3, b1_4 are the slopes for the first hidden neuron; and b2_1, b2_2, b2_3, b2_4 are the slopes for the second hidden neuron.
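To make the notation concrete, here is a minimal NumPy sketch of the forward pass described by these equations. It assumes the hidden layer uses an identity (linear) activation, which is what the equations for B1 and B2 imply. The weight values and the names `W_hidden`, `b_hidden`, `w_out`, `b_out`, and `forward` are hypothetical, chosen only for illustration and not taken from a fitted model.

```python
import numpy as np

# Hypothetical weights for illustration only (not from a fitted model).
W_hidden = np.array([[0.5, -1.0, 0.25, 2.0],    # b1_1 .. b1_4
                     [1.5,  0.0, -0.5, 1.0]])   # b2_1 .. b2_4
b_hidden = np.array([0.1, -0.2])                # b1_0, b2_0
w_out = np.array([2.0, -3.0])                   # c_1, c_2
b_out = 0.5                                     # c_0


def forward(x):
    """Forward pass, assuming an identity activation in the hidden layer."""
    B = b_hidden + W_hidden @ x   # hidden outputs B1, B2
    return b_out + w_out @ B      # network output C1


x = np.array([1.0, 2.0, 3.0, 4.0])  # one observation (x1, x2, x3, x4)
y_hat = forward(x)
print(y_hat)
```

Each row of `W_hidden` holds the slopes of one hidden neuron, so the matrix product reproduces the two equations for B1 and B2 in a single step.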
Now, to find the actual coefficients of the input variables, can we substitute B1 and B2 into the output equation as follows?
C1 = c_0 + c_1*(b1_0 + b1_1*x1 + b1_2*x2 + b1_3*x3 + b1_4*x4) + c_2*(b2_0 + b2_1*x1 + b2_2*x2 + b2_3*x3 + b2_4*x4)
which, when expanded and grouped by variable, gives:
C1 = (c_0 + c_1*b1_0 + c_2*b2_0) + (c_1*b1_1 + c_2*b2_1)*x1 + (c_1*b1_2 + c_2*b2_2)*x2 + (c_1*b1_3 + c_2*b2_3)*x3 + (c_1*b1_4 + c_2*b2_4)*x4
Where:
c_0 + c_1*b1_0 + c_2*b2_0 = intercept of the final equation
c_1*b1_1 + c_2*b2_1 = coefficient for variable x1
c_1*b1_2 + c_2*b2_2 = coefficient for variable x2
c_1*b1_3 + c_2*b2_3 = coefficient for variable x3
c_1*b1_4 + c_2*b2_4 = coefficient for variable x4
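The proposed intercept and coefficients can be checked numerically. The sketch below uses randomly generated weights (hypothetical, not from a trained model) and again assumes an identity activation in the hidden layer. It computes the collapsed coefficients as a matrix product and compares the collapsed linear equation against the two-layer forward pass on random inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random weights standing in for a fitted 4-2-1 network.
W_hidden = rng.normal(size=(2, 4))   # rows: (b1_1..b1_4), (b2_1..b2_4)
b_hidden = rng.normal(size=2)        # b1_0, b2_0
w_out = rng.normal(size=2)           # c_1, c_2
b_out = rng.normal()                 # c_0

# Collapsed linear model: coef[j] = c_1*b1_j + c_2*b2_j
coef = w_out @ W_hidden                  # shape (4,)
intercept = b_out + w_out @ b_hidden     # c_0 + c_1*b1_0 + c_2*b2_0

# Compare against the layer-by-layer forward pass (identity activation assumed).
X = rng.normal(size=(100, 4))
y_network = b_out + (b_hidden + X @ W_hidden.T) @ w_out
y_collapsed = intercept + X @ coef

linear_match = np.allclose(y_network, y_collapsed)
print(linear_match)
```

The two outputs agree only because every layer here is linear. With a nonlinear hidden activation such as tanh or ReLU, B1 and B2 are no longer linear in x1..x4, so a single set of coefficients of this form would not reproduce the network's predictions.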
Please tell me whether this is correct and makes sense.