
After searching hyperparameters, I tried two ways to get the best model.

One way is to use tuner.get_best_hyperparameters() to rebuild the model, as shown in code snippet "A".

The other is to use tuner.get_best_models() directly, as shown in code snippet "B".

Then I used both models to predict on the same data, and the prediction results are quite different.

Why? What's the difference between these two models?

A:

tuner.search(x_train, y_train)
best_hps = tuner.get_best_hyperparameters(1)[0]
best_model_params = build_model(best_hps)
best_model_params.fit(X, Y)
best_model_params.save("best_model_params_2")

B:

tuner.search(x_train, y_train)
models = tuner.get_best_models(num_models=1)
best_model = models[0]
best_model.fit(X, Y)
best_model.save("best_model_2")
  • Not that this is an answer, but the pyimagesearch tutorial on keras tuner uses option A: https://pyimagesearch.com/2021/06/07/easy-hyperparameter-tuning-with-keras-tuner-and-tensorflow/ – Eduardo Jul 26 '22 at 12:53

1 Answer


Not a definitive answer, as I am not sure of all the differences here, but it is based on the official docs and a tutorial. Both the keras-tuner "Getting started" guide and the PyImageSearch tutorial on keras-tuner use option A.

For example, the official keras-tuner tutorial searches for hyperparameters on a subset of the data and then trains the final model as follows:

import numpy as np  # needed for np.concatenate below

# Get the top 2 hyperparameters.
best_hps = tuner.get_best_hyperparameters(2)
# Build the model with the best hp.
model = build_model(best_hps[0])
# Fit with the entire dataset.
x_all = np.concatenate((x_train, x_val))
y_all = np.concatenate((y_train, y_val))
model.fit(x=x_all, y=y_all, epochs=1)

In the PyImageSearch tutorial they do the following:

# build the best model and train it
model = tuner.hypermodel.build(bestHP)
H = model.fit(x=trainX, y=trainY,
    validation_data=(testX, testY), batch_size=config.BS,
    epochs=config.EPOCHS, callbacks=[es], verbose=1)
# evaluate the network
predictions = model.predict(x=testX, batch_size=32)
...

Interestingly, in the latter they call tuner.hypermodel.build directly on the tuner object, which is only shown in the API documentation.
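
As far as I can tell, when the tuner is constructed with build_model as its hypermodel, tuner.hypermodel.build(best_hp) and build_model(best_hp) produce the same untrained model. Here is a minimal, self-contained sketch; the layer sizes, tuner settings, and random data are made up purely for illustration:

import numpy as np
import tensorflow as tf
import keras_tuner as kt

def build_model(hp):
    # Illustrative hypermodel; the layers and hyperparameter ranges are made up.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(hp.Int("units", 32, 128, step=32), activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Dummy data just so the sketch runs end to end.
x_train, y_train = np.random.rand(64, 4), np.random.rand(64, 1)
x_val, y_val = np.random.rand(16, 4), np.random.rand(16, 1)

tuner = kt.RandomSearch(build_model, objective="val_loss", max_trials=2, overwrite=True)
tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=1, verbose=0)

best_hp = tuner.get_best_hyperparameters(1)[0]
model_a = build_model(best_hp)              # option A from the question
model_b = tuner.hypermodel.build(best_hp)   # what the PyImageSearch tutorial does
# Both are freshly built, untrained models configured with best_hp.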

The confusing part is whether the best model found is trained or untrained. With option A the model is untrained: the keras-tuner documentation for the get_best_hyperparameters method says the following and shows this example:

Returns the best hyperparameters, as determined by the objective.

This method can be used to reinstantiate the (untrained) best model found during the search process.

Example

best_hp = tuner.get_best_hyperparameters()[0]
model = tuner.hypermodel.build(best_hp)

However, the get_best_models documentation says it returns a "List of trained model instances sorted from the best to the worst", but it also adds:

For best performance, it is recommended to retrain your Model on the full dataset using the best hyperparameters found during search, which can be obtained using tuner.get_best_hyperparameters(). The models are loaded with the weights corresponding to their best checkpoint (at the end of the best epoch of best trial).
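
Putting the two quoted passages together, this is my reading of why the question's two snippets give different predictions; the sketch below reuses the names from the question and is not something the docs state verbatim:

# Option A: get_best_hyperparameters() returns only hyperparameter values.
# build_model(best_hps) creates a new model with freshly initialized (random)
# weights, and fit(X, Y) trains it from scratch.
best_hps = tuner.get_best_hyperparameters(1)[0]
best_model_params = build_model(best_hps)
best_model_params.fit(X, Y)

# Option B: get_best_models() returns a model already loaded with the weights
# of its best checkpoint from the search, so fit(X, Y) continues training from
# those weights rather than starting over.
best_model = tuner.get_best_models(num_models=1)[0]
best_model.fit(X, Y)

# Different starting weights (random initialization vs. checkpoint) plus the
# usual randomness of training mean the two fitted models will generally not
# make identical predictions.

That would also explain why the docs recommend the option A workflow (retraining on the full dataset with the best hyperparameters) for best performance.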

Eduardo