
I am new to deep learning and would like to understand the points below. Can you please help?

  1. If I set the number of epochs to 100 for training and then evaluate the model, does it use the best epoch's model or the final model after 100 epochs?

  2. In history, I see loss and val_loss. Does the model try to minimize only the training loss and just report val_loss for our reference, similar to the metrics it shows?

  3. If I use Keras Tuner (RandomSearch), there is an objective. I am confused whether the model tries to reduce the loss provided during compile or the objective provided to the tuner.

Can you please clarify the above points?

Jegatheesan

1 Answer


A high number of epochs will only lead to high accuracy and low loss on the training dataset. The important things to watch during training are val_loss and the validation metric. In most cases, if the model keeps training on the same data, it will overfit, which shows up on the validation data (the validation data are not seen by the model and are only evaluated after each epoch). So a high number of epochs won't lead to a better model.

So the most important thing is to monitor val_loss and stop training if you notice it continuously increasing. You can implement a callback (EarlyStopping) to stop training whenever an increase in val_loss is observed.
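As a minimal sketch of the callback described above (the toy data, layer sizes, and patience value here are illustrative assumptions, not part of the original answer):

```python
import numpy as np
import tensorflow as tf

# Toy data, just for illustration.
x = np.random.rand(200, 4).astype("float32")
y = np.random.randint(0, 2, size=(200,)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop when val_loss has not improved for 5 consecutive epochs, and roll
# back to the weights from the epoch with the lowest val_loss.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)

history = model.fit(
    x, y,
    validation_split=0.2,
    epochs=100,
    callbacks=[early_stop],
    verbose=0,
)
```

With `restore_best_weights=True`, the model returned after `fit()` holds the best-epoch weights rather than the last-epoch weights, which answers question 1 for the early-stopping case.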

Soroush Mirzaei
  • Thank you Soroush for your input. So if I use 50 epochs and, once training is done with all those 50 epochs, try to predict, will it use the model available after the 50th epoch, or the best model among the 50 epochs? – Jegatheesan Aug 28 '22 at 10:41
  • If you train the model for 50 epochs without the EarlyStopping callback, it will return a model whose evaluation on the training and test sets matches the 50th epoch, so it won't return the best model. But if you implement the callback mentioned before, it will return the best model based on the parameters you set in keras.callbacks.EarlyStopping(monitor='val_loss', patience, restore_best_weights=True). For more details check out this link: [EarlyStopping](https://keras.io/api/callbacks/early_stopping) – Soroush Mirzaei Aug 28 '22 at 11:09
  • In simple words: during training, TensorFlow starts a session and tries to minimize the loss value (the loss reflects the model's confidence about its predictions). The whole process is about finding the best weights for the neurons by minimizing the loss: the model uses a gradient tape to compute the gradient of the loss with respect to the model's trainable variables, then an optimizer applies these gradients to those variables. This repeats for the number of epochs you set in model.fit(). – Soroush Mirzaei Aug 28 '22 at 11:16
  • Thank you so much. I believe gradient descent tries to minimize the training loss and not the validation loss. Can you please confirm? Also, any input you can give on my 3rd question regarding Keras Tuner? – Jegatheesan Aug 28 '22 at 11:28
  • So if you train a model and both the loss value and the metric you defined (for instance accuracy) are high, it doesn't mean it is the best model, because the model is supposed to perform well on data it hasn't seen before. If the loss is high and the accuracy is also high, the model predicts correctly but has doubt in its predictions. – Soroush Mirzaei Aug 28 '22 at 11:29
  • Absolutely right: the model trains only on the training set and is evaluated on the validation set. The tuner, in turn, reduces the loss provided in the tuner objective. – Soroush Mirzaei Aug 28 '22 at 11:32
  • Thank you, Soroush for the clarification – Jegatheesan Aug 28 '22 at 11:43