
We are having trouble getting Python code, used by a process hosted on a network drive, to run on multiple machines. I thought virtualenv would be a good way to solve this, but the additional complication of sharing the installation makes it a challenge.

Setup:

Machine H:
  - hosts network drive D
  - doesn't execute anything
Machine A:
  - has python installed
  - has created the virtualenv on D
Machine B:
  - may have python installed, but no additional packages

Drive D:

project/
  |- SomeProcess.kjb
  |- venv/
    |- Scripts/
      |- activate.bat
    |- [...]
  |- python_package/
    |- entry_point.py
    |- [...]

On drive D we have an installation of a process that is executed by different machines and, as part of the execution, calls some Python code. Specifically this is a Kettle job, but I don't think that matters. The point is that during the process it calls the Python code by running

/D/project/venv/Scripts/activate.bat
python /D/project/python_package/entry_point.py

in a shell-like environment. The job itself is run by a program that must be installed and executed on the calling machine. This works IFF the calling machine is also the one that created the virtualenv on the drive. If another machine tries to run the process, it fails with errors indicating that the wrong Python installation is being used, such as:

Traceback (most recent call last):
  File "D:\project\venv\lib\site.py", line 761, in <module>
    main()
  File "D:\project\venv\lib\site.py", line 738, in main
    paths_in_sys = addsitepackages(paths_in_sys)
  File "D:\project\venv\lib\site.py", line 271, in addsitepackages
    addsitedir(sitedir, known_paths)
  File "D:\project\venv\lib\site.py", line 202, in addsitedir
    addpackage(sitedir, name, known_paths)
  File "D:\project\venv\lib\site.py", line 170, in addpackage
    exec(line)
  File "<string>", line 1, in <module>
  File "D:\project\venv\lib\importlib\util.py", line 14, in <module>
    from contextlib import contextmanager
ModuleNotFoundError: No module named 'contextlib'

The alternative of not using a virtualenv is not desirable, as it would require the calling machines to have the required Python packages installed globally.

A possible fix is to install a virtualenv for each calling machine and to switch to the correct one when the process executes. However, we currently have no idea how to get the process to recognize the calling machine and select the right virtualenv.
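For reference, one way the selection could work (a hypothetical sketch; the `venv-<hostname>` naming convention and the paths are assumptions, not something we have set up) is a small launcher that picks the interpreter by hostname:

```python
import socket
import subprocess
import sys
from pathlib import Path

# Hypothetical layout: D:/project/venv-<hostname>/ for each calling machine.
PROJECT = Path("D:/project")


def venv_python_for(hostname: str) -> Path:
    """Build the path to the per-machine virtualenv's interpreter."""
    return PROJECT / f"venv-{hostname}" / "Scripts" / "python.exe"


def run_entry_point() -> int:
    """Run entry_point.py with the virtualenv that matches this machine."""
    python = venv_python_for(socket.gethostname())
    if not python.exists():
        sys.exit(f"No virtualenv prepared for this machine: {python}")
    return subprocess.call(
        [str(python), str(PROJECT / "python_package" / "entry_point.py")]
    )
```

The Kettle job would then invoke such a launcher with whatever Python is on the calling machine, instead of activating one shared venv.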

Is this a problem that can be solved with virtualenvs, or should we look into alternative workflows that don't have this problem?

Etienne Ott
  • You could invest some time into using Docker. You can start from here - https://docs.docker.com/samples/library/python/ or find some tutorials online, there are lots of them, for instance - https://djangostars.com/blog/what-is-docker-and-how-to-use-it-with-python/ – Alexandr Zayets Aug 14 '19 at 12:08
  • "it stops with errors related to not using the correct python installation" don't paraphrase errors, just copy/paste them. – Jean-Paul Calderone Aug 14 '19 at 12:14

1 Answer


The first consideration is: there should be a REALLY REALLY GOOD reason to store code on a shared drive without a VCS. If you don't have one, you should start using Git.

Even if you have that really good reason: sharing virtualenvs is not a good idea. Instead, here are two alternative workflows to solve your problem:

  1. Use Docker if you want to keep an isolated workspace shared across many computers. Containers are a perfect solution for what you describe (https://runnable.com/docker/python/dockerize-your-python-application).
  2. Otherwise, if you just need to share the dependencies between machines, you can use a requirements.txt file. The shared drive then won't contain the libraries, but once the code is replicated each machine can install them with pip.
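For option 2, a minimal sketch of the workflow (the folder names here are just examples): freeze the dependency versions once, then let each calling machine build its own local virtualenv from that file:

```shell
# On the machine that set up the environment: record exact versions.
python -m venv build-venv
build-venv/bin/pip freeze > requirements.txt    # venv\Scripts\pip.exe on Windows

# On each calling machine: recreate the environment locally from that file.
python -m venv local-venv
local-venv/bin/pip install -r requirements.txt
```

Each machine then runs entry_point.py with its own local interpreter, so no virtualenv is ever shared over the network drive.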
Pablo Martinez
  • The project folder is in source control, but execution of the process depends on configuration files and artifacts, which is why we want different users (on different machines) to be able to execute the same installation of the process. Okay, thanks for the answer. It seems like docker is the way to go. We dismissed docker as a solution for a completely different problem, but it seems this has biased me against using it where it is the perfect solution. – Etienne Ott Aug 14 '19 at 12:32