I'm working on a regression task using a neural network implemented in Keras. I trained the model for 1000 epochs, and on the last epoch I obtained a mean absolute error (MAE) of 3.8. However, when I performed cross-validation using the cross_validate function, the resulting MAE values were much higher.
Here's a simplified version of my code:
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_validate
from keras.models import Sequential
from keras.layers import Dense
# ... (data preprocessing steps)
def build_regressor():
    # ... (model architecture definition)
regressor = KerasRegressor(build_fn=build_regressor, batch_size=10, epochs=1000)
scoring = ['neg_mean_absolute_error']
results = cross_validate(estimator=regressor, X=x_train_scaled, y=y_train, cv=5, scoring=scoring)
And this is the output of this code:
As you can see, in the last epoch the MAE (mean absolute error) is 3.8.
The results I obtained from the cross-validation (-results['test_neg_mean_absolute_error']) show higher MAE values than the training MAE of 3.8 from the last epoch. I expected the cross-validation results to be similar or at least in the same range as the training MAE.
This is the unexpected result:
-results['test_neg_mean_absolute_error']
array([13.15011832, 9.84094066, 9.6454553 , 11.37415547,
13.12120939])
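To put a single number on the gap, the five fold scores can be summarised by their mean and sample standard deviation. This is a small sketch using only the values shown above; the variable names are just for illustration. The mean comes out at about 11.4, roughly three times the last-epoch training MAE of 3.8.

```python
from statistics import mean, stdev

# Per-fold test MAE values copied from the cross_validate output above.
cv_mae = [13.15011832, 9.84094066, 9.6454553, 11.37415547, 13.12120939]
# Training MAE reported on the last epoch.
train_mae = 3.8

cv_mean = mean(cv_mae)
cv_std = stdev(cv_mae)  # sample standard deviation across the 5 folds

print(f"CV MAE: {cv_mean:.2f} +/- {cv_std:.2f}")
print(f"CV MAE / training MAE: {cv_mean / train_mae:.1f}x")
```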
I'm wondering why there is such a difference between the training MAE and the cross-validation results, especially considering the low MAE achieved on the last epoch. Am I missing something or should I consider other factors when interpreting the cross-validation results?
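One check that might help here is asking cross_validate to also report scores on the training folds via return_train_score=True, so the two numbers are computed the same way on each split. The sketch below is only an illustration: the synthetic data and the DecisionTreeRegressor are stand-ins (assumptions, not my actual data or Keras model), chosen because an unpruned tree fits its training data very closely.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, purely illustrative (not the data from the question).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + rng.normal(scale=2.0, size=200)

# Stand-in estimator (assumption): an unpruned tree fits the training
# folds almost perfectly, so it shows a clear train/test gap.
model = DecisionTreeRegressor(random_state=0)

results = cross_validate(
    model, X, y, cv=5,
    scoring=["neg_mean_absolute_error"],
    return_train_score=True,  # also score each model on its own training fold
)

# Scores are negated MAE, so flip the sign to get MAE back.
train_mae = -results["train_neg_mean_absolute_error"]
test_mae = -results["test_neg_mean_absolute_error"]
print("train MAE per fold:", train_mae)
print("test MAE per fold: ", test_mae)
```

If the training-fold MAE is low while the test-fold MAE is high, the gap is between seen and unseen data rather than something specific to cross_validate.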