
I am following this tutorial:

https://cloud.google.com/architecture/clv-prediction-with-offline-training-train#introduction

and I am rewriting some of the code on Google Colab.

They are using the following for a learning rate decay:

initial_lr = 0.096505
learning_decay_rate = 0.7

lr_schedule = tf.compat.v1.train.exponential_decay(                    
    learning_rate = initial_lr,
    global_step = tf.compat.v1.train.get_global_step(),                                                                         
    decay_steps = checkpoint_steps,
    decay_rate = learning_decay_rate,
    staircase = True)
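
If I understand the docs correctly, with staircase = True this evaluates to

decayed_lr = initial_lr * learning_decay_rate ** (global_step // decay_steps)

i.e. the learning rate drops by a factor of 0.7 every checkpoint_steps training steps.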

…and the following model is the one I need to rebuild:

estimator = tf.estimator.DNNRegressor(
    feature_columns = dnn_features,
    hidden_units = [128, 64, 32, 16],
    config = tf.estimator.RunConfig(
      save_checkpoints_steps = checkpoint_steps),
    model_dir = model_dir,
    batch_norm = True,
    dropout = 0.843251,
    optimizer = tfa.optimizers.ProximalAdagrad(
        learning_rate = lr_schedule,                                                
        l1_regularization_strength = 0.0026019,
        l2_regularization_strength = 0.0107146))

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

I cannot run the model like this, because I get a

ValueError: None values not supported.

…and the reason is the function get_global_step. My results are pretty bad compared to theirs when I use, for example:

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(...)
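
For comparison, a Keras schedule that should mirror the original one (assuming checkpoint_steps is reused as decay_steps and staircase = True is kept) would look like this:

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate = initial_lr,
    decay_steps = checkpoint_steps,    # same interval as the v1 schedule
    decay_rate = learning_decay_rate,
    staircase = True)                  # discrete drops, as in the original

Keras optimizers feed their own iteration counter into the schedule, so no global_step argument is needed there.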

My questions are:

  • What exactly is the global_step?

  • Is it crucial for the model to get better?

  • In case that I need it: How can I make it work like this?

    The `global_step` is a necessary value for the learning rate decay. It tracks the number of batches the graph has seen "globally." The reason it is evaluating to `None` is that it was created outside of the graph. In the example you show, it was evaluated in the [`model_fn`](https://github.com/GoogleCloudPlatform/tensorflow-lifetime-value/blob/0f8c16ea70a2e7da370965e23e9e2154978364fa/clv_mle/trainer/model.py#L162), which is evaluated inside the graph when the model begins training – TayTay Oct 11 '21 at 20:55
  • Thank you for your answer @TayTay ! I am pretty new to the whole tensorflow thing. How can I get the global_step evaluated inside the graph? – timmy Oct 18 '21 at 10:10
  • Probably important to add: For the DNNRegressor estimator in the example, the model_fn you linked isn't used. @TayTay – timmy Oct 18 '21 at 10:13
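
Following the explanation in the comments, here is a sketch of what I think might work (not verified): passing the optimizer to the estimator as a callable, so that the schedule is built inside the graph, where the global step variable already exists:

def make_optimizer():
    # Runs inside the estimator's graph, so get_global_step() returns the
    # real global step variable instead of None.
    lr = tf.compat.v1.train.exponential_decay(
        learning_rate = initial_lr,
        global_step = tf.compat.v1.train.get_global_step(),
        decay_steps = checkpoint_steps,
        decay_rate = learning_decay_rate,
        staircase = True)
    return tfa.optimizers.ProximalAdagrad(
        learning_rate = lr,
        l1_regularization_strength = 0.0026019,
        l2_regularization_strength = 0.0107146)

estimator = tf.estimator.DNNRegressor(
    feature_columns = dnn_features,
    hidden_units = [128, 64, 32, 16],
    config = tf.estimator.RunConfig(save_checkpoints_steps = checkpoint_steps),
    model_dir = model_dir,
    batch_norm = True,
    dropout = 0.843251,
    optimizer = make_optimizer)   # callable, invoked while the graph is built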

0 Answers