
I am using KerasTuner for hyperparameter tuning of an ANN model. I vary the number of layers between 2 and 5 and the number of nodes per layer between 10 and 30, and I use KerasTuner's RandomSearch to select the best-performing models. The top results are as follows:

Best val_mse So Far: 0.03936238835255305
Total elapsed time: 01h 23m 53s
Results summary
Results in project\Sd attempt10
Showing 10 best trials
Objective(name='val_mse', direction='min')
Trial summary
Hyperparameters:
num_layers: 3
units_0: 15
units_1: 15
learning_rate: 0.001
units_2: 10
Score: 0.03936238835255305
Trial summary
Hyperparameters:
num_layers: 3
units_0: 15
units_1: 25
learning_rate: 0.01
units_2: 20
units_3: 25
units_4: 10
Score: 0.03974008063475291

We can see that in the second trial summary, although the number of layers is 3, unit values are present for 5 hidden layers.

I have also attached the code snippet used to arrive at these results.

from tensorflow import keras
from tensorflow.keras import layers
from keras_tuner import RandomSearch

def build_model(hp):
    model = keras.Sequential()
    for i in range(hp.Int('num_layers', 2, 5)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i),
                                            min_value=10,
                                            max_value=30,
                                            step=5),
                               activation='relu'))
    model.add(layers.Dense(1, activation='linear'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='mse',
        metrics=['mse', 'mae'])
    return model


tuner = RandomSearch(
    build_model,
    objective='val_mse',
    max_trials=25,
    executions_per_trial=3,
    directory='project',
    project_name='Sd attempt10')
tuner.search(X_train, y_train,
             epochs=100,
             validation_data=(Xnew, yact))
tuner.results_summary()  # prints the summary itself; the method returns None

Can someone explain this behavior, given these results?

srinivas

1 Answer


The reason for this is that the tuner has, at some point, run a trial in which the model had 5 layers. For that trial it had to choose an optimal number of units for each of those hidden layers, and those values are stored in the search space (e.g. `units_4 = 10`). As the tuner keeps going it may find that 3 hidden layers is optimal, but it still keeps, and reports, the stored unit counts for the fourth and fifth layers.
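This bookkeeping can be sketched in plain Python. The following is a hypothetical simulation, not the KerasTuner API: once a hyperparameter has been registered by any trial, it stays in the search space and a value for it appears in every later trial summary.

```python
# Hypothetical sketch of the tuner's bookkeeping (not the KerasTuner API).
search_space = {}  # hyperparameter name -> stored value (placeholders here)

def run_trial(num_layers):
    # Register the hyperparameters this trial's model actually uses.
    search_space['num_layers'] = num_layers
    for i in range(num_layers):
        search_space.setdefault(f'units_{i}', 15)  # placeholder sampled value
    # The trial summary reports a value for *every* registered
    # hyperparameter, including ones this trial's model never built.
    summary = dict(search_space)
    summary['num_layers'] = num_layers
    return summary

first = run_trial(5)    # registers units_0 .. units_4
second = run_trial(3)   # builds only 3 layers,
                        # but units_3 and units_4 still appear in its summary
```

Here `second` still contains `units_3` and `units_4`, mirroring the second trial summary in the question.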

The model does not have any "invisible" layers; the tuner is simply keeping track of the optimal values for every hyperparameter it has ever registered, and when a trial uses fewer layers, some of those parameters are no longer applicable to that trial's model.

Governor