
I want to be able to dispatch a large number of data collection tasks to Ray workers while a trainer works concurrently and asynchronously on another CPU, training on the collected data. The idea resembles this example from the docs: https://docs.ray.io/en/master/auto_examples/plot_parameter_server.html#asynchronous-parameter-server-training

The difference is that I don't want to hang waiting for the next sample to arrive (via the ray.wait call in the linked example), since that blocks me from assigning new tasks. Instead, I want to throw a large batch of collection tasks at the pool and have the trainer's training process start only once at least N samples have been collected by those tasks. Roughly, this is the pattern I have in mind, sketched below.
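
Just a sketch, not working code; collect_sample, the sleep, and the numbers are placeholders:

```python
import random
import time

import ray

ray.init()

@ray.remote
def collect_sample(task_id):
    time.sleep(random.random())  # stand-in for a real env rollout
    return [random.random() for _ in range(4)]

# Fire off many collection tasks at once, without blocking on any of them.
pending = [collect_sample.remote(i) for i in range(1000)]

# This is the part I'm missing: start the trainer only once at least
# N of these samples exist, while the remaining tasks keep running.
N = 100
```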

How can I do that using ray?

1 Answer


Take a look at, e.g., DQN's or SAC's execution plan in RLlib: ray/rllib/agents/dqn/dqn.py::execution_plan().

For example, DQN samples via its remote workers and puts the collected samples into a replay buffer while, at the same time, sampling from that buffer and doing learning updates on the buffer-sampled data. You can also set the "training intensity", the ratio between time steps sampled and time steps trained.
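
As a rough illustration, assuming the Ray 1.x RLlib API (the key is training_intensity in DQN's config; the values here are made up):

```python
from ray.rllib.agents.dqn import DQNTrainer

# Illustrative only: exact keys/values depend on your Ray version.
trainer = DQNTrainer(
    env="CartPole-v0",
    config={
        "num_workers": 4,           # remote workers collecting samples
        "training_intensity": 16,   # ratio of trained to sampled time steps
    },
)

for _ in range(10):
    print(trainer.train()["episode_reward_mean"])
```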

SAC works the same way. APEX-DQN, on the other hand, uses distributed replay buffers to allow even faster sample storage and retrieval.
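
If you would rather build this directly on Ray core instead of going through RLlib, one way to sketch the pattern from your question is a buffer actor that the collection tasks push into, with training gated on the buffer's size. Everything here (ReplayBuffer, collect_sample, N) is illustrative, not RLlib code:

```python
import random
import time

import ray

ray.init()

@ray.remote
class ReplayBuffer:
    """Central store: collection tasks push, the trainer pulls."""
    def __init__(self):
        self.samples = []

    def add(self, sample):
        self.samples.append(sample)

    def size(self):
        return len(self.samples)

    def sample_batch(self, batch_size):
        return random.sample(self.samples, min(batch_size, len(self.samples)))

@ray.remote
def collect_sample(buffer, task_id):
    time.sleep(random.random())  # stand-in for an env rollout
    buffer.add.remote([random.random() for _ in range(4)])

buffer = ReplayBuffer.remote()

# Submit all collection tasks up front; nothing blocks here.
tasks = [collect_sample.remote(buffer, i) for i in range(1000)]

# Gate training on the buffer size rather than on individual tasks.
N = 100
while ray.get(buffer.size.remote()) < N:
    time.sleep(0.1)

# From here on, train concurrently while collection keeps filling the buffer.
for step in range(50):
    batch = ray.get(buffer.sample_batch.remote(32))
    # ... do one SGD update on `batch` ...
```

The driver never waits on individual collection tasks; it only polls the buffer's size, so you can keep submitting new tasks at any point, and training proceeds concurrently once the threshold is reached.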