8

I am trying to retrain a TF Object Detection API model from a checkpoint, using an existing .config file for the training pipeline and the tf.estimator.train_and_evaluate() method, as in models/research/object_detection/model_main.py. It saves checkpoints every N steps or every N seconds.

But I want to save only the single best model, as in Keras. Is there a way to do this with a TF Object Detection API model? Maybe some options/callbacks for tf.estimator.Estimator.train, or some way to use the Detection API with Keras?

Silicon
  • I have to say that both solutions, from Sharky and from prouast, are correct. I tried both of them and they work fine. So if somebody has the same issue as I had, you can use either of these options. Thanks a lot to Sharky and prouast for the clear and useful answers! – Silicon Mar 07 '19 at 07:18
  • Thanks for this - but where do I have to put the code from these answers? I am using the default `model_main_tf2.py` script from the Object Detection API for training. Do I need to modify the OD scripts themselves? – Matthias Mar 02 '21 at 13:50

3 Answers

7

I have been using https://github.com/bluecamel/best_checkpoint_copier which works well for me.

Example:

# Assumes best_checkpoint_copier.py from the repository above is on your import path.
from best_checkpoint_copier import BestCheckpointCopier

best_copier = BestCheckpointCopier(
    name='best',  # directory within model directory to copy checkpoints to
    checkpoints_to_keep=10,  # number of checkpoints to keep
    score_metric='metrics/total_loss',  # metric to use to determine "best"
    # Comparison function used to determine the "best" checkpoint
    # (x is the current checkpoint; y is the previously copied
    # checkpoint with the highest/worst score).
    compare_fn=lambda x, y: x.score < y.score,
    sort_key_fn=lambda x: x.score,  # sort key when discarding excess checkpoints
    sort_reverse=False)  # sort order when discarding excess checkpoints

Then pass it to your eval_spec:

eval_spec = tf.estimator.EvalSpec(
   ...
   exporters=best_copier,
   ...)
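To make the roles of compare_fn, sort_key_fn and sort_reverse more concrete, here is a minimal pure-Python sketch of the "keep the best N" bookkeeping. It is not the library's actual implementation; the `Checkpoint` record and the `update_best` helper are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical record: one evaluated checkpoint and its score (here: loss).
Checkpoint = namedtuple("Checkpoint", ["step", "score"])


def update_best(kept, candidate, keep=3,
                compare_fn=lambda x, y: x.score < y.score,
                sort_key_fn=lambda x: x.score,
                sort_reverse=False):
    """Return the new list of kept checkpoints after seeing `candidate`.

    `kept` is assumed to be sorted best-first, so kept[-1] is the worst one.
    """
    # Keep the candidate if there is still room, or if it beats the worst kept one.
    if len(kept) < keep or compare_fn(candidate, kept[-1]):
        kept = sorted(kept + [candidate], key=sort_key_fn, reverse=sort_reverse)
    # Discard the worst entries beyond the limit.
    return kept[:keep]


kept = []
for step, loss in [(100, 0.9), (200, 0.7), (300, 0.8), (400, 0.5), (500, 0.95)]:
    kept = update_best(kept, Checkpoint(step, loss))

print([c.step for c in kept])
```

For a metric where higher is better (for example mAP), you would flip the comparison to `x.score > y.score` and set `sort_reverse=True`.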
prouast
  • Thanks for this - but where do I have to put that code? I am using the default `model_main_tf2.py` script from the Object Detection API for training. Do I need to modify the OD scripts themselves? – Matthias Mar 02 '21 at 13:48
6

You can try using BestExporter. As far as I know, it's the only option for what you're trying to do.

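exporter = tf.estimator.BestExporter(
    name='best_exporter',
    # Your own serving input receiver function (needed to export a SavedModel).
    serving_input_receiver_fn=serving_input_receiver_fn,
    # compare_fn is left at its default, which treats a smaller loss as better.
    exports_to_keep=5)

eval_spec = tf.estimator.EvalSpec(
    input_fn=input_fn,    # your evaluation input function
    steps=steps,          # number of evaluation steps
    exporters=exporter)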

https://www.tensorflow.org/api_docs/python/tf/estimator/BestExporter
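If "best" should mean highest mAP rather than lowest loss, a custom compare_fn can be passed instead. BestExporter calls it with the best evaluation result so far and the current one, and expects True when the current one is better. The sketch below is an assumption-laden example: the function name `map_larger` is made up, and the metric key `DetectionBoxes_Precision/mAP` is assumed to match what your evaluation config reports (check your eval output for the exact key).

```python
def map_larger(best_eval_result, current_eval_result,
               key="DetectionBoxes_Precision/mAP"):
    """Return True if the current eval result has a higher value for `key`.

    Both arguments are dicts mapping metric names to values, as produced by
    an Estimator evaluation. The default key is an assumption.
    """
    if not best_eval_result or key not in best_eval_result:
        raise ValueError("best_eval_result must contain the key %r" % key)
    if not current_eval_result or key not in current_eval_result:
        raise ValueError("current_eval_result must contain the key %r" % key)
    return best_eval_result[key] < current_eval_result[key]


# Usage (requires TensorFlow, shown as a comment to keep this block standalone):
# exporter = tf.estimator.BestExporter(compare_fn=map_larger, exports_to_keep=5, ...)
```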

Sharky
  • I see that the exporter writes files in a slightly different format (`saved_model.pb`), and the meta file (for example `model.ckpt-3073.meta`) is missing. Are they equivalent to use? As I remember, `*.pb` is generated after exporting a frozen inference graph. – Pratik Khadloya Apr 20 '20 at 00:45
  • `BestExporter` also saves a ckpt, but it is not copied to a new folder, so it will be cleared once new ckpts are generated (say with `keep_checkpoint_max=5` in `tf.estimator.RunConfig`). In that case @prouast's answer fits. – MeadowMuffins Dec 21 '20 at 06:38
1

If you are training with the tensorflow/models repo, you can modify the create_train_and_eval_specs function in models/research/object_detection/model_lib.py to include the best exporter:

final_exporter = tf.estimator.FinalExporter(
    name=final_exporter_name, serving_input_receiver_fn=predict_input_fn)

best_exporter = tf.estimator.BestExporter(
    name="best_exporter",
    serving_input_receiver_fn=predict_input_fn,
    event_file_pattern='eval_eval/*.tfevents.*',
    exports_to_keep=5)
exporters = [final_exporter, best_exporter]

train_spec = tf.estimator.TrainSpec(
    input_fn=train_input_fn, max_steps=train_steps)

eval_specs = [
    tf.estimator.EvalSpec(
        name=eval_spec_name,
        input_fn=eval_input_fn,
        steps=eval_steps,
        exporters=exporters)
]
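Estimator exporters write each export into a timestamped subdirectory of `<model_dir>/export/<exporter name>/`. Since BestExporter only exports when the evaluation improves, the newest subdirectory should hold the best model so far. A small helper to locate it might look like the sketch below; `latest_export_dir` is a hypothetical name, and the default exporter name simply mirrors the `name="best_exporter"` used above.

```python
import os


def latest_export_dir(model_dir, exporter_name="best_exporter"):
    """Return the newest timestamped export directory, or None if there is none."""
    base = os.path.join(model_dir, "export", exporter_name)
    if not os.path.isdir(base):
        return None
    # Export subdirectories are named with a numeric timestamp; skip anything else
    # (for example temporary directories left by an export in progress).
    stamps = [d for d in os.listdir(base)
              if d.isdigit() and os.path.isdir(os.path.join(base, d))]
    if not stamps:
        return None
    return os.path.join(base, max(stamps, key=int))
```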
Pratik Khadloya