I'm new to programming and to Ray, and I have a simple question about which parameters can be specified when using Ray Tune. In particular, the Ray Tune documentation says that all of the auto-filled fields (steps_this_iter, episodes_this_iter, etc.) can be used as stopping conditions or in the Scheduler/Search Algorithm specification.

However, the following only works once I remove the "episodes_this_iter" specification. Does this work only as part of the stopping criteria?

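import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer

# qsdm and defaultconfig are my own environment module and its settings
ray.init()
tune.run(
    PPOTrainer,
    stop={"training_iteration": 1000},
    config={
        "env": qsdm.QSDEnv,
        "env_config": defaultconfig,
        "num_gpus": 0,
        "num_workers": 1,
        "lr": tune.grid_search([0.00005, 0.00001, 0.0001]),
        "episodes_this_iter": 2500,  # works only once this line is removed
    },
)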
sbrand

1 Answer

tune.run() is the one filling in those fields so that we can use them elsewhere. The stopping criterion is just one of the places where we can use them.

To see why the example doesn't work, consider a simpler analogue: setting "episodes_total": 100 in the config.

The trainer itself is the one incrementing the episode count so that the rest of the system knows how far along we are. It doesn't work if we try to change that count or fix it to a particular value ourselves. The same reasoning applies to the other fields in the list.
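To make the distinction concrete, here is a minimal sketch of a stopping criterion that reads an auto-filled field instead of setting it. The should_stop helper and the two result dicts are made up for illustration; the helper is only a simplified stand-in for the check Tune runs on each reported result, not Ray's actual code.

```python
# Hypothetical stand-in for Tune's per-result check; not Ray's actual code.
def should_stop(result, stop_criteria):
    """Return True once any listed field has reached its threshold."""
    return any(result[name] >= limit for name, limit in stop_criteria.items())


# The auto-filled fields go in `stop`, not in `config`.
stop_criteria = {"training_iteration": 1000, "episodes_total": 2500}

# Made-up result dicts, shaped like what a trainer reports each iteration.
early = {"training_iteration": 10, "episodes_total": 400}
late = {"training_iteration": 60, "episodes_total": 2600}

print(should_stop(early, stop_criteria))
print(should_stop(late, stop_criteria))
```

With Ray installed, the same dict would be passed as stop=stop_criteria in the tune.run() call from the question, with "episodes_this_iter" removed from config.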


As for the scheduler and search algorithms, I have no experience with them. But what we want to do is put those conditions inside the scheduler or search algorithm itself, and not in the trainer directly.

Here's an example with Bayesian optimisation search, although I don't know what it would mean to do this:

from ray import tune
from ray.tune.suggest.bayesopt import BayesOptSearch

tune.run(
    # ...

    # 10 trials
    num_samples=10,

    search_alg=BayesOptSearch(
        # look for learning rates within this range, given as (min, max):
        {'lr': (0.00001, 0.1)},

        # optimise for this metric:
        metric='episodes_this_iter',  # <------- auto-filled field here
        mode='max',

        utility_kwargs={
            'kind': 'ucb',
            'kappa': 2.5,  # a number, not the string '2.5'
            'xi': 0.0,
        },
    ),
)
WaterGenie