I have the following directory tree:

working_dir\
    main.py
my_agent\
    my_worker.py
my_utility\
    my_utils.py

The code in each file is as follows:

""" main.py """

import os, sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from my_agent.my_worker import MyWorker
import ray

ray.init()
workers = [MyWorker.remote(i) for i in range(10)]
ids = [worker.get_id.remote() for worker in workers]
# print(*ids, sep='\n')
print(*ray.get(ids), sep='\n')

""" my_worker.py """
from my_utility import my_utils
import ray

@ray.remote
class MyWorker():
    def __init__(self, id):
        self.id = id

    def get_id(self):
        return my_utils.f(self.id)

""" my_utils.py """
def f(id):
    return '{}: Everything is fine...'.format(id)

Here's a part of the error message I received

Traceback (most recent call last):
  File "/Users/aptx4869/anaconda3/envs/p35/lib/python3.5/site-packages/ray/function_manager.py", line 616, in fetch_and_register_actor
    unpickled_class = pickle.loads(pickled_class)
  File "/Users/aptx4869/anaconda3/envs/p35/lib/python3.5/site-packages/ray/cloudpickle/cloudpickle.py", line 894, in subimport
    __import__(name)
ImportError: No module named 'my_utility'

Traceback (most recent call last):
  File "main.py", line 12, in <module>
    print(*ray.get(ids), sep='\n')
  File "/Users/aptx4869/anaconda3/envs/p35/lib/python3.5/site-packages/ray/worker.py", line 2377, in get
    raise value
ray.worker.RayTaskError: ray_worker (pid=30025, host=AiMacbook)
Exception: The actor with name MyWorker failed to be imported, and so cannot execute this method

If I remove all statements related to ray, the above code works fine. My guess is therefore that ray runs each actor in a new process and that sys.path.append only affects the main process. So I added the following code to my_worker.py:

import os, sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

But it still does not work: the same error message shows up. I have run out of ideas. What should I do?

Maybe
  • I consider this `os.path.dirname(os.path.dirname(__file__))` better than this `os.path.join(os.path.dirname(__file__), '..')` – spaniard Jan 24 '19 at 01:23
  • @spaniard Thanks:-) I guess you suggested `sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))`. But this does not work either... – Maybe Jan 24 '19 at 02:24
  • Have you tried to remove the line `if __name__ == '__main__' and __package__ is None:`? and simply always add to your path the parent directory? – spaniard Jan 24 '19 at 02:28
  • @spaniard Yes, the same story goes on – Maybe Jan 24 '19 at 02:35
  • @Darkonaut I did what you suggested, but it is still not working. As I remember, Python 3 no longer requires `__init__.py`, does it? – Maybe Jan 24 '19 at 02:37
  • Yes since Python 3.3 it shouldn't be needed anymore, but my guess was that ray might need it somehow since it's a framework. – Darkonaut Jan 24 '19 at 02:43
  • @Darkonaut Thanks. I've added `__init__.py` to every directory, even the root directory which contains these three folders. It still does not work. – Maybe Jan 24 '19 at 02:46
  • Could be a naming conflict. You have `utils.py` and `worker.py` and I see ray has these files too. Rename your files with a prefix to avoid the clash. – Darkonaut Jan 24 '19 at 02:49
  • @Darkonaut I did what you suggested and prefixed all relevant files and classes with `my`. Still not working... – Maybe Jan 24 '19 at 02:58
  • @Darkonaut Yep, I've changed all relevant names... The error happens at the `print` statement. IMHO, if the error were caused by a name conflict, it should not happen so late. – Maybe Jan 24 '19 at 03:22
  • I cannot speak for ray, but at least with the standard library's `multiprocessing.Pool` the imports for the functions used have to be made _again_ for every distributed task. I'm not familiar with ray's internals, but that line `ImportError: No module named 'utils'` makes me wonder. It looks like it's trying to import utils from `utils.py` instead of from the directory. – Darkonaut Jan 24 '19 at 03:27
  • @Darkonaut I've updated the code, file names, and the error messages. It can be seen now that the `ImportError` refers to the folder name `my_utility` instead of the `my_utils.py` file. To the best of my knowledge, ray calls `from my_utility import my_utils` in every new actor process, and that causes this problem. – Maybe Jan 24 '19 at 05:09

2 Answers

You are correct about what the issue is.

In your example, you modify sys.path in main.py in order to be able to import my_agent.my_worker and my_utility.my_utils.

However, this path change is not propagated to the worker processes, so if you were to run a remote function like

@ray.remote
def f():
    # Print the PYTHONPATH on the worker process.
    import sys
    print(sys.path)

f.remote()

you would see that sys.path on the worker does not include the parent directory that you added.

The reason that modifying sys.path on the worker (e.g., in the MyWorker constructor) doesn't work is that the MyWorker class definition is pickled and shipped to the workers. Then the worker unpickles it, and the process of unpickling the class definition requires my_utils to be imported, and this fails because the actor constructor hasn't had a chance to run yet.
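The unpickling failure can be reproduced without Ray. The following is a minimal sketch using only the standard library's pickle and subprocess, not Ray's cloudpickle, so it is an analogy rather than what Ray does internally. The temporary directory layout and the helper names `unpickle_in_child` and `demo` are made up for illustration. The parent extends sys.path and pickles my_utils.f; a fresh interpreter that lacks that path entry then fails during unpickling, before any of the user's code gets a chance to run.

```python
import os
import pickle
import subprocess
import sys
import tempfile

# Code run by the child process: unpickle what arrives on stdin, then call it.
CHILD_CODE = """
import pickle, sys
try:
    func = pickle.loads(sys.stdin.buffer.read())
    print(func(7))
except ImportError as exc:
    print("ImportError:", exc)
"""


def unpickle_in_child(payload, cwd, extra_path=None):
    """Unpickle `payload` in a fresh interpreter and return what it printed."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    if extra_path is not None:
        env["PYTHONPATH"] = extra_path
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload, stdout=subprocess.PIPE, env=env, cwd=cwd,
    )
    return result.stdout.decode().strip()


def demo():
    with tempfile.TemporaryDirectory() as root:
        pkg_root = os.path.join(root, "project")    # hypothetical project root
        empty_cwd = os.path.join(root, "elsewhere")  # child runs from here
        os.makedirs(os.path.join(pkg_root, "my_utility"))
        os.makedirs(empty_cwd)
        with open(os.path.join(pkg_root, "my_utility", "my_utils.py"), "w") as fh:
            fh.write("def f(id):\n    return '{}: Everything is fine...'.format(id)\n")

        # Parent: extend sys.path (as main.py does), import, and pickle.
        sys.path.append(pkg_root)
        try:
            from my_utility import my_utils
            # The function is stored as a module/name reference, not as code.
            payload = pickle.dumps(my_utils.f)
        finally:
            sys.path.remove(pkg_root)

        without_path = unpickle_in_child(payload, empty_cwd)
        with_path = unpickle_in_child(payload, empty_cwd, extra_path=pkg_root)
    return without_path, with_path


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The first child reports an ImportError for my_utility, while the second, which receives the directory through PYTHONPATH, calls the function normally.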

There are a few possible solutions here.

  1. Run the script with something like

    PYTHONPATH=$(dirname $(pwd)):$PYTHONPATH python main.py
    

    (from within working_dir/). That should solve the issue because in this case the worker processes are forked from the scheduler process (which is itself forked from the main Python interpreter when you call ray.init()), so the environment variable is inherited by the workers. This doesn't happen for sys.path, presumably because it is not an environment variable.

  2. It looks like adding the lines

    parent_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    os.environ["PYTHONPATH"] = parent_dir + os.pathsep + os.environ.get("PYTHONPATH", "")
    

    in main.py (before the ray.init() call) also works for the same reason as above.

  3. Consider adding a setup.py and installing your project as a Python package so that it's automatically on the relevant path.
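The difference between sys.path and PYTHONPATH that options 1 and 2 rely on can be checked with plain subprocesses. This is only a sketch under the assumption that Ray's workers behave like ordinary child processes in this respect; it does not start Ray, and the helper names `child_sees` and `demo` are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# The child reports whether the directory given as argv[1] is on its sys.path.
CHILD_CODE = (
    "import os, sys; "
    "target = os.path.abspath(sys.argv[1]); "
    "print(target in [os.path.abspath(p) for p in sys.path])"
)


def child_sees(directory, env):
    """Start a fresh interpreter with `env` and ask it about `directory`."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, directory],
        stdout=subprocess.PIPE, env=env,
    )
    return result.stdout.decode().strip() == "True"


def demo():
    # Start from the current environment, minus any existing PYTHONPATH.
    base_env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    with tempfile.TemporaryDirectory() as extra:
        # 1. Change sys.path in the parent only: the child never sees it.
        sys.path.append(extra)
        try:
            via_sys_path = child_sees(extra, base_env)
        finally:
            sys.path.remove(extra)

        # 2. Put the directory in PYTHONPATH: the child inherits it.
        via_environ = child_sees(extra, dict(base_env, PYTHONPATH=extra))
    return via_sys_path, via_environ


if __name__ == "__main__":
    print(demo())
```

The first value is False and the second is True: sys.path is per-interpreter state, whereas environment variables are copied into every child process when it starts.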

Robert Nishihara

The new "Runtime Environments" feature, which didn't exist at the time of this post, should help with this issue: https://docs.ray.io/en/latest/handling-dependencies.html#runtime-environments. (See the working_dir and py_modules entries.)
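As a rough sketch of how the py_modules entry might be used for the layout in the question: the helper names `build_runtime_env` and `start_ray` are hypothetical, and the code assumes a Ray release recent enough to accept a runtime_env argument in ray.init() and local directory paths in py_modules, so check the linked documentation for your version.

```python
import os


def build_runtime_env(project_root):
    """Return a runtime_env dict that ships both local packages to the workers.

    `project_root` is assumed to be the directory that contains
    working_dir/, my_agent/ and my_utility/.
    """
    return {
        "py_modules": [
            os.path.join(project_root, "my_agent"),
            os.path.join(project_root, "my_utility"),
        ]
    }


def start_ray(project_root):
    """Initialise Ray with the runtime environment built above."""
    # Imported lazily so build_runtime_env() can be used without Ray installed.
    import ray

    ray.init(runtime_env=build_runtime_env(project_root))
```

In main.py this would replace the bare ray.init() call, for example start_ray(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))). The driver itself still needs my_agent to be importable, so the sys.path.append line in main.py stays.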