
I am interested in using the Tune library for reinforcement learning, and I would like to use its built-in TensorBoard capability. However, the metric I am using to tune my hyperparameters is based on a time-consuming evaluation procedure that should only be run infrequently.

According to the documentation, the _train method returns a dictionary that is used both for logging and for tuning hyperparameters. Is it possible to perform logging more frequently within the _train method? Alternatively, could I return the values I wish to log from _train but sometimes omit the expensive-to-compute metric from the dictionary?

ethanabrooks

1 Answer


One option is to use your own logging mechanism in the Trainable. You can log to the trial-specific directory (Trainable.logdir). If this conflicts with the built-in TensorBoard logging, you can disable that by setting tune.run(loggers=None).
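A minimal sketch of that idea. The helper name `log_scalar` and the CSV format are illustrative assumptions, not part of the Tune API; inside your Trainable's `_train` you would call it with `self.logdir` as the directory, as often as you like:

```python
import csv
import os


def log_scalar(logdir, metric, step, value):
    """Append one scalar to a per-metric CSV file inside the trial directory.

    A stand-in for whatever logging mechanism you prefer (CSV, TensorBoardX,
    plain text); ``logdir`` would be ``self.logdir`` inside the Trainable.
    """
    path = os.path.join(logdir, f"{metric}.csv")
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["step", metric])
        writer.writerow([step, value])
```

Because this writes into the trial directory, each trial's logs stay separated even when many trials run concurrently.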

Another option is, as you mentioned, to sometimes omit the expensive-to-compute metric from the result dictionary. If you run into issues with that, you can also return None as the value for the metrics you don't plan to compute in a particular iteration.
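That pattern might look like the following sketch. The function name, the metric key `episode_reward_mean`, and the `eval_interval` default are all assumptions for illustration; the point is only that the expensive key is left out of the dict on most iterations:

```python
def build_result(iteration, cheap_metrics, expensive_eval, eval_interval=10):
    """Assemble the dict returned from _train.

    The expensive metric is computed only every ``eval_interval`` iterations;
    on other iterations its key is simply left out of the result (setting it
    to None instead would be the other variant mentioned above).
    """
    result = dict(cheap_metrics)
    if iteration % eval_interval == 0:
        result["episode_reward_mean"] = expensive_eval()
    return result
```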

Hope that helps!

richliaw
  • Looking for example at https://github.com/ray-project/ray/blob/1eaa57c98f8870a43e1ea14ec011b6bd4be97c8d/python/ray/tune/schedulers/async_hyperband.py#L100, it seems like the Schedulers expect the chosen metric to be present in the `results` dict and to be a scalar value. – ethanabrooks Aug 14 '19 at 20:12
  • Ah I see; you want to use that metric for optimization. One option here is to use a dummy value (i.e., the last seen value) and to provide the more frequently calculated values. – richliaw Aug 15 '19 at 04:26
  • That's a great idea! That's what I'll do. – ethanabrooks Aug 15 '19 at 13:37
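The dummy-value idea from the comments above can be sketched as a small wrapper. The class name `CachedEvaluator` and its parameters are assumptions, not Tune API; every `_train` call reports a scalar the scheduler can consume, but the real evaluation only runs every `eval_interval` calls:

```python
class CachedEvaluator:
    """Wrap an expensive evaluation so every training step can report a value.

    On most calls this returns the last computed result (the "dummy" value
    the scheduler still sees as a scalar); the real evaluation runs only
    every ``eval_interval`` calls.
    """

    def __init__(self, evaluate_fn, eval_interval=10, initial=0.0):
        self.evaluate_fn = evaluate_fn
        self.eval_interval = eval_interval
        self.last_value = initial
        self.calls = 0

    def __call__(self):
        if self.calls % self.eval_interval == 0:
            self.last_value = self.evaluate_fn()
        self.calls += 1
        return self.last_value
```

One caveat worth keeping in mind: schedulers such as ASHA will make stop/continue decisions on the stale cached value between real evaluations, so `eval_interval` should be chosen with the scheduler's grace period in mind.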