
I am using the Asynchronous HyperBand scheduler (https://ray.readthedocs.io/en/latest/tune-schedulers.html?highlight=hyperband) with 2 GPUs. My machine has 2 GPUs and 12 CPUs. Still, only one trial runs at a time, even though 2 trials could run simultaneously.

I specify:

ray.init(num_gpus=torch.cuda.device_count())

"resources_per_trial": {
    "cpu": 4,
    "gpu": int(args.cuda),
}
  • What is the output of `ray.global_state.cluster_resources()`? Or, if you're using the nightly Ray wheels, just `ray.cluster_resources()`. What happens if you set `"cpu": 1` and `"gpu": 1`? What precisely is `args.cuda`? Maybe that number should just be 1. – Robert Nishihara May 28 '19 at 23:45
  • Same discussion at https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/ray-dev/Q1OAUyIVnfY/otWN4P9YAwAJ – Robert Nishihara May 29 '19 at 01:47
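
Following up on the first comment, a quick way to inspect what resources Ray has actually registered (the exact call depends on the Ray version, as the comment notes):

import ray

ray.init(num_gpus=2)

# Older releases expose the cluster view via ray.global_state:
print(ray.global_state.cluster_resources())  # e.g. {'CPU': 12.0, 'GPU': 2.0}
# On the nightly wheels the same information is at the top level:
# print(ray.cluster_resources())

If the printed dictionary shows fewer GPUs than expected, the scheduler cannot place a second trial.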

0 Answers