
I have been working with deep reinforcement learning, and in the literature the learning rates are usually lower than the ones I have found in other settings.

My model is the following one:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import Adam

def create_model(self):

    model = Sequential()
    # LSTM encoder over the state history, followed by two L2-regularized dense layers
    model.add(LSTM(HIDDEN_NODES, input_shape=(STATE_SIZE, STATE_SPACE), return_sequences=False))
    model.add(Dense(HIDDEN_NODES, activation='relu', kernel_regularizer=regularizers.l2(0.000001)))
    model.add(Dense(HIDDEN_NODES, activation='relu', kernel_regularizer=regularizers.l2(0.000001)))
    # Linear output head producing one value per action
    model.add(Dense(ACTION_SPACE, activation='linear'))

    # Compile with a Huber loss and gradient clipping
    # ('learning_rate' replaces the deprecated 'lr' argument of Adam)
    model.compile(loss=tf.keras.losses.Huber(delta=1.0),
                  optimizer=Adam(learning_rate=LEARNING_RATE, clipnorm=1))

    return model

Where the initial learning rate (LEARNING_RATE) is 3e-5. For fine-tuning, I freeze the first two layers (this step is essential in my setting) and decrease the learning rate to 3e-9. During fine-tuning, the model might suffer from distributional shift, since the source of samples is perturbed data. Besides this, is there another possible source of problems with such a low learning rate that could prevent my model from continuing to improve?
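
For reference, the fine-tuning step would look roughly like the sketch below (the FINE_TUNE_LR constant, the prepare_for_fine_tuning name, and the assumption that the first two layers are frozen by setting trainable = False are placeholders for illustration):

FINE_TUNE_LR = 3e-9

def prepare_for_fine_tuning(self):

    model = self.model  # the model returned by create_model()

    # Freeze the first two layers (the LSTM and the first Dense layer)
    for layer in model.layers[:2]:
        layer.trainable = False

    # Recompile so the frozen layers and the lower learning rate take effect
    model.compile(loss=tf.keras.losses.Huber(delta=1.0),
                  optimizer=Adam(learning_rate=FINE_TUNE_LR, clipnorm=1))

    return model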

HenDoNR

1 Answer


First, show me a sample of your data.

Theoretical Answer:

Perturbation has been shown to help with a range of issues that arise when training neural networks and when using trained models. It is usually applied to one of three components: gradients, weights, or inputs. Perturbing gradients helps tackle the vanishing-gradient problem, perturbing weights helps the optimizer escape saddle points, and perturbing inputs helps defend against malicious (adversarial) attacks. Overall, perturbation in its various forms strengthens the model against different instabilities; for example, it can keep the model from settling at a "correctness wreckage point", because such a position gets tested under perturbation (of inputs, weights, or gradients), which pushes the model back toward a "correctness attraction point".
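
As a rough illustration of the simplest of the three, input perturbation can be as little as adding small Gaussian noise to each state before it is fed to the network. The sketch below assumes NumPy; the function name and noise scale are arbitrary examples, not part of your setup:

import numpy as np

def perturb_inputs(states, noise_std=0.01):
    # Add small Gaussian noise to a batch of states; noise_std is an
    # illustrative value, not a recommended setting
    noise = np.random.normal(loc=0.0, scale=noise_std, size=states.shape)
    return states + noise

# Usage sketch: perturb a batch of states before a training step
# noisy_batch = perturb_inputs(state_batch)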

For now, perturbation is mainly a matter of empirical experimentation guided by intuition about the problem at hand: one judges whether perturbing a component of the training process makes sense intuitively, and then verifies empirically whether it mitigates the problem. Nevertheless, in the future we will likely see more perturbation theory in deep learning, and in machine learning in general, backed by theoretical guarantees.

Faisal Shahbaz