
I'm training a model using transfer learning. The pre-trained model I use is "SSD MobileNet V2 FPNLite 320x320" from the TF2 Model Zoo, and I have some confusion about the train_config item in pipeline.config.

If I change num_steps, the model trains for num_steps. But when I change total_steps, the model still trains for num_steps. Even if I set num_steps > total_steps, there is no error. And when I check all the SSD models in the TF2 Model Zoo, total_steps is always the same as num_steps.

  • Question: Do I need to set total_steps equal to num_steps? What is the relationship between them?
train_config {
  batch_size: 128
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_crop_image {
      min_object_covered: 0.0
      min_aspect_ratio: 0.75
      max_aspect_ratio: 3.0
      min_area: 0.75
      max_area: 1.0
      overlap_thresh: 0.0
    }
  }
  sync_replicas: true
  optimizer {
    momentum_optimizer {
      learning_rate {
        cosine_decay_learning_rate {
          learning_rate_base: 0.07999999821186066
          total_steps: 50000
          warmup_learning_rate: 0.026666000485420227
          warmup_steps: 1000
        }
      }
      momentum_optimizer_value: 0.8999999761581421
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  num_steps: 50000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "classification"
  fine_tune_checkpoint_version: V2
}
Hoang97

1 Answer


The num_steps parameter in train_config specifies how many steps the model trains for. The total_steps parameter, on the other hand, belongs to the learning rate schedule and controls how the learning rate decays as training progresses.

In your case, the optimizer uses a cosine decay learning rate schedule, which decays the learning rate over the number of steps given by total_steps (after a linear warmup of warmup_steps). The schedule is defined entirely by total_steps; it knows nothing about num_steps.
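To see the relationship concretely, here is a plain-Python sketch of a warmup-plus-cosine-decay schedule using the values from your config. This is an illustrative approximation of the shape of the Object Detection API's schedule, not the library's actual code; the function name and the clamping at total_steps are my own choices for the sketch:

```python
import math

def cosine_decay_with_warmup(step,
                             learning_rate_base=0.08,
                             total_steps=50000,
                             warmup_learning_rate=0.026666,
                             warmup_steps=1000):
    """Sketch of a linear-warmup + cosine-decay schedule (illustrative only)."""
    if step < warmup_steps:
        # Linear warmup from warmup_learning_rate up to learning_rate_base.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    # Cosine decay from learning_rate_base down to 0 at total_steps.
    # Clamping progress at 1.0 keeps the rate at 0 if training runs past
    # total_steps (i.e. if num_steps > total_steps).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * min(progress, 1.0)))

# The training loop runs for num_steps; the schedule only ever reads total_steps.
for step in (0, 1000, 25000, 50000):
    print(step, cosine_decay_with_warmup(step))
```

Plotting or printing this makes the behavior obvious: the curve starts at warmup_learning_rate, peaks at learning_rate_base after warmup, and falls to zero exactly at total_steps, regardless of how many steps you actually train for.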

If you set num_steps greater than total_steps, the model still trains for num_steps, but the learning rate finishes its decay at total_steps (the Object Detection API clamps it to zero beyond that point), so the remaining steps learn little or nothing. Conversely, if you set total_steps greater than num_steps, training stops before the decay completes, leaving the learning rate above its scheduled minimum. In other words, total_steps is only a reference for the learning rate schedule and does not affect the actual number of training steps.

It's generally recommended to set total_steps equal to num_steps, as the Model Zoo configs do, so that the learning rate decays smoothly over exactly the duration of training.