
TL;DR - setuptools/distutils script wrapper .exe entry_points do not trigger Windows multiprocessing's infinite recursion. Wheel's wrapper .exe entry_points do. How can I get the previous behavior back?

Many of us have probably run into the problem where Python 2.X's multiprocessing module causes infinite recursion on Windows when a module script is invoked directly.
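For readers who haven't hit it: the pitfall comes from multiprocessing on Windows re-importing the main script in each worker. A minimal sketch (a hypothetical stand-alone script, not from my library):

```python
# On Windows, multiprocessing starts workers by re-importing the main
# script. Without the __name__ guard below, each re-import would create
# another Pool, which spawns more workers, recursing forever.
import multiprocessing

def square(x):
    return x * x

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
    pool.close()
    pool.join()
```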

When I created an entry point for my library that led to a function using multiprocessing, the entry point script and its wrapping .exe file were both runnable on Windows when I installed them using setuptools; they could directly run a function that created multiprocessing.Pool and Process objects. This was great, because my library depended on multiprocessing to speed up an embarrassingly parallel program.

Things had been working well until I tried using bdist_wheel to share the library. Though the build process still produced an .exe wrapper and scripts that looked the same from the outside, the .exe was not the same kind of wrapper.

My binary file format skills are far from good, but I knew there was some sort of translation of a zip-compressed file into the executable I was seeing, so I went and did the only logical thing and unzipped the .exe file. I used 7zip, since even my mingw bash shell didn't have zip/unzip. The .exe wrapper installed from the Wheel file unzipped into a directory containing a simple __main__.py script:

# -*- coding: utf-8 -*-
import re
import sys

from example_multiprocessing.runner import main

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

It was just a python script run from a zip file. No wonder it blew up.
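That a zip archive containing a __main__.py runs directly, with __name__ set to '__main__', is easy to reproduce (the archive name here is illustrative):

```python
import os
import subprocess
import sys
import tempfile
import zipfile

# Build a tiny zip containing only __main__.py. Python executes such an
# archive as if it were a script, so __name__ inside it is '__main__' --
# exactly the condition that sets off the Windows multiprocessing recursion.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "app.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("__main__.py", "print('running as', __name__)")

out = subprocess.check_output([sys.executable, archive])
print(out.decode().strip())  # prints: running as __main__
```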

I re-installed using setup.py, bringing back the old wrapper .exe, and tried to unzip that. Lo and behold, it is a full-on Portable Executable file, with .data, .pdata, .rdata, and .text sections extracted into their own files by 7zip. Two hours later, I had dug through the distutils and setuptools source code to find out what is going on, and found that the .exe I was seeing is made by pouring some entry-point-specific python code into a pre-canned launcher exe built for Windows. Great. I know more than I did before. But if the difference was the .exe wrapper, why does the -script.py entry_point work?

__requires__ = 'example-multiprocessing==1.0.2'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('example-multiprocessing==1.0.2', 'console_scripts', 'example-mp-run')()
    )

Is load_entry_point special? It is a wrapper around EntryPoint.load(), which explicitly calls __import__:

# From [https://bitbucket.org/pypa/pkg_resources/src/33e56f318f5086158de8bb2827acb55db2dbc153/pkg_resources.py?at=default#cl-2258][1]
def load(self, require=True, env=None, installer=None):
    if require: self.require(env, installer)
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
    for attr in self.attrs:
        try:
            entry = getattr(entry,attr)
        except AttributeError:
            raise ImportError("%r has no %r attribute" % (entry,attr))
    return entry

The globals() look like a dead end from the documentation of __import__, but the ['__name__'] fromlist looks like it may be useful. The trouble is that __import__ is a built-in implemented in C. I've tried to read the mirror posted on GitHub, but it is beyond my ken at the moment, forking into three or more interacting functions whose connection to fromlist I can't trace. I may also be getting distracted by __import__, since the next block after it accesses an attribute of the module stored in entry, which I infer to be the entry point function itself.

When I run the following code:

getattr(__import__("example_multiprocessing.runner", globals(), globals(), ["__name__"]), "main")()

I do see the main function I expected execute and return without issue.
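The fromlist behavior can be checked against a stdlib module (os.path standing in for example_multiprocessing.runner): a non-empty fromlist makes __import__ return the leaf module rather than the top-level package, and the imported module's __name__ is its real dotted name, never '__main__':

```python
import os
import os.path

# With a non-empty fromlist, __import__ returns the leaf module...
leaf = __import__("os.path", globals(), globals(), ["__name__"])
assert leaf is os.path

# ...whereas with an empty fromlist it returns the top-level package.
top = __import__("os.path", globals(), globals(), [])
assert top is os

# The imported module keeps its real name -- it is never '__main__',
# which is why modules loaded this way don't trip multiprocessing.
assert leaf.__name__ != "__main__"

# getattr then resolves the target callable, as in EntryPoint.load.
join = getattr(leaf, "join")
assert join("a", "b") == os.path.join("a", "b")
```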

Is __import__ really doing the work I need, setting the __name__ of the script to something not including __main__? Why doesn't runpy, the machinery behind -m, work?
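For what it's worth, a hand-rolled wrapper along these lines sidesteps the Wheel launcher entirely. load_entry is a hypothetical helper, and os.path:join stands in for example_multiprocessing.runner:main so the sketch is self-contained:

```python
import sys

def load_entry(module_name, attr):
    # Import the target module by its dotted name so its __name__ is the
    # real module name, never '__main__'; multiprocessing can then safely
    # re-import it in child processes on Windows.
    module = __import__(module_name, globals(), globals(), ["__name__"])
    return getattr(module, attr)

if __name__ == '__main__':
    # example_multiprocessing.runner:main would go here; os.path:join
    # is only a stand-in for demonstration.
    main = load_entry("os.path", "join")
    print(main("spam", "eggs"))
```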

  • I've already opened this issue against pip at https://github.com/pypa/pip/issues/1891. According to Nick Coghlan multiprocessing and `__main__` on Python 2.7 is fundamentally broken and won't be fixed. Hopefully, pip with wheels will switch to a different launcher technique soon. – schlamar Nov 18 '15 at 08:33
  • Thanks. I really should have done that myself those many months ago. I came to the same conclusion myself, and wrote scripts that worked as entry point wrappers for the impacted facets of my program, explicitly using `__import__` to load the module which used `multiprocessing`. – mobiusklein Nov 18 '15 at 19:45
  • This will be fixed in Python 2.7.11: http://bugs.python.org/issue10128 :) – schlamar Nov 19 '15 at 09:37
