I've created a learner (a remote object decorated with `@ray.remote`) in one Python process. Now I want to create a worker in a new process (started with `python new_file.py`, on either the same machine or a different one) and connect it to that learner. How can I do this with Ray? Assume the new worker is on the same machine as the learner, though answers covering the case where they are on different machines are also appreciated.

Maybe
  • I'd suggest looking at actors and using the fact that you can pass around "actor handles". See https://ray.readthedocs.io/en/latest/actors.html and https://ray.readthedocs.io/en/latest/actors.html#passing-around-actor-handles – Robert Nishihara Nov 27 '19 at 18:14
  • @RobertNishihara Thanks for responding, and sorry that I did not make things clear. The new worker is created by running `python new_file.py`, on either the same machine or a different one. In that case, I don't know how to connect it to the process that runs the learner or how to get the "learner handle". – Maybe Nov 27 '19 at 22:48
  • I see. You can share actors between multiple Ray applications running on the same Ray cluster. Have both drivers connect to the same Ray cluster using `ray.init(...)`, e.g., `ray.init(address='auto')` and then you can use the named actor API in https://github.com/ray-project/ray/blob/master/python/ray/experimental/named_actors.py. Note that this API may change a bit in the future (though the functionality will continue to exist). – Robert Nishihara Nov 30 '19 at 07:31
  • Hi, @RobertNishihara, I wrote an example in the answer. Do you think it's an appropriate solution? – Maybe Dec 02 '19 at 10:23

1 Answer

Thanks for the help, @RobertNishihara.

Here's an example I wrote based on https://ray.readthedocs.io/en/latest/advanced.html#detached-actors, https://github.com/ray-project/ray/blob/72755563652ea153c0dc60c95e233f31a4c3082a/python/ray/experimental/named_actors.py#L22, and the `ray.init(address='auto')` suggestion from @RobertNishihara.

""" main.py that starts the server """
import time

import ray

@ray.remote
class Counter:
    def __init__(self):
        self.count = 0

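    def set_self_handler(self, handler):
        # Store a handle to this actor so wait() can reschedule itself.
        self.handler = handler

    def wait(self):
        # While the count is still zero, sleep briefly and enqueue another
        # wait() call on this same actor; other calls such as increase()
        # can run in between.
        if self.count == 0:
            time.sleep(1)
            self.handler.wait.remote()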

    def increase(self, n):
        self.count += n

    def get_count(self):
        return self.count

if __name__ == '__main__':
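    ray.init()  # start a local Ray instance; increase.py connects to it

    # Register the actor under a name so other drivers can look it up.
    counter = Counter.options(name='CounterActor').remote()

    # Give the actor its own handle so wait() can reschedule itself.
    counter.set_self_handler.remote(counter)

    counter.wait.remote()
    # Poll once per second until another driver has increased the count.
    while ray.get(counter.get_count.remote()) == 0:
        time.sleep(1)
    print(ray.get(counter.get_count.remote()))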
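    # Shutting down here stops the Ray instance that main.py started.
    ray.shutdown()

"""increase.py, started by another python command"""
import ray
from main import Counter  # Counter is defined in main.py above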

if __name__ == '__main__':
    ray.init(address='auto')   # connect to the server that has been started by main.py

    counter = ray.experimental.get_actor('CounterActor')
    ray.get(counter.increase.remote(1))

Running increase.py produces the following error messages, because main.py shuts the server down:

2019-12-02 18:20:43,708 ERROR worker.py:939 -- print_logs: Connection closed by server.
2019-12-02 18:20:43,708 ERROR worker.py:1039 -- listen_error_messages_raylet: Connection closed by server.
2019-12-02 18:20:43,709 ERROR import_thread.py:89 -- ImportThread: Connection closed by server.
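The `while` loop in main.py polls `get_count` once per second with no upper bound, so main.py waits forever if increase.py never runs. As a rough sketch only, the loop could be factored into a small polling helper with a timeout. The name `wait_until` and its `timeout` and `interval` parameters are hypothetical and not part of Ray; the helper uses only the standard library.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 30.0,
               interval: float = 1.0) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the predicate became true, False if the timeout expired.
    (Hypothetical helper for illustration; not a Ray API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep for the polling interval, but never past the deadline.
        time.sleep(min(interval, remaining))
```

In main.py this might be called as `wait_until(lambda: ray.get(counter.get_count.remote()) != 0, timeout=60)`, and the return value could decide whether to print the count before calling `ray.shutdown()`.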

Maybe