The following code works perfectly:

from multiprocessing import Pool
import time

values = list(range(10))

def print_time_and_value(value):
    print(time.time(), value)

if __name__ == '__main__':

    p = Pool(4)
    p.map(print_time_and_value, values)

but when I change the "multiprocessing" import to the "multiprocess" library:

from multiprocess import Pool

it raises the following error during execution:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    p.map(print_time_and_value, values)
  File "C:\Users\User\Anaconda3_64bits\lib\site-packages\multiprocess\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\User\Anaconda3_64bits\lib\site-packages\multiprocess\pool.py", line 657, in get
    raise self._value
NameError: name 'time' is not defined

I can't use multiprocessing because my main application will later need to pass unpicklable objects, so I have to use multiprocess, which serializes with dill.
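For context, the limitation mentioned above can be sketched with the standard library alone. This is only an illustration: the helper name can_pickle is hypothetical, and the sketch shows what the standard pickle module rejects, not what dill does internally.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A built-in function is pickled by reference to its importable name.
print(can_pickle(len))

# A lambda has no importable name, so standard pickle refuses it.
print(can_pickle(lambda x: x))
```

Objects like the lambda here are the kind that multiprocessing cannot send to a worker, which is why a dill-based pool is attractive.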

I have noticed that moving the "time" import from the global scope into the "print_time_and_value" function solves the issue, but this behavior seems odd. Since multiprocess is a fork of multiprocessing, I expected it to work the same way.

I'm using Python 3.7.0 and multiprocess version 0.70.7, running in a 64-bit Anaconda environment on Windows 10.

Rodrigo Bonadia
  • I'm not familiar with `multiprocess`, but it sounds like it isn't properly serializing (or maybe deserializing) `print_time_and_value`. – chepner Apr 29 '19 at 15:00

2 Answers

You should import the required modules inside the function itself.

Example:

def print_time_and_value(value):
    import time
    print(time.time(), value)

Moreover, in this setup variables defined at module level, outside the function, may not be available inside the worker. It is safest to use only variables defined inside the function.

For example:

y = 5
def print_add(x):
    print(x+y)

This would give you an error.

Instead, define the variable inside the function:

def print_add(x):
    y = 5
    print(x+y)
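
If the value cannot be hard-coded inside the function, another option is to pass it in as an argument. The following is a minimal sketch, not something tested against the asker's setup: it binds y with functools.partial so the value travels with the function, and the fallback import to the standard multiprocessing module is an assumption added only so the snippet runs where multiprocess is not installed.

```python
from functools import partial

# Assumption: fall back to the standard library if multiprocess is missing.
try:
    from multiprocess import Pool
except ImportError:
    from multiprocessing import Pool


def print_add(x, y):
    """Add y to x; y arrives as an argument, not as a module-level global."""
    result = x + y
    print(result)
    return result


if __name__ == '__main__':
    with Pool(2) as p:
        # partial binds y=5, so each worker receives the value with the function.
        results = p.map(partial(print_add, y=5), [1, 2, 3])
    print(results)
```

This keeps the function free of references to module-level state, which is the same idea as defining y inside the function.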
Jay

I'm the multiprocess author. I see you are on Windows. When running on Windows, I suggest calling freeze_support; I believe that should resolve the NameError you are seeing.

import multiprocess as mp
import time

values = list(range(10))

def print_time_and_value(value):
    print(time.time(), value)


if __name__ == '__main__':

    mp.freeze_support()  # needed for Windows
    p = mp.Pool(4)
    p.map(print_time_and_value, values)

With multiprocess your code should even work in the interpreter:

Python 3.7.3 (default, Mar 30 2019, 05:40:15) 
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import multiprocess as mp
>>> import time
>>> values = list(range(10))
>>> def print_time_and_value(value):
...     print(time.time(), value)
... 
>>> p = mp.Pool(4)
>>> _ = p.map(print_time_and_value, values)
1556681189.844021 0
1556681189.8443708 1
1556681189.8446798 2
1556681189.845576 4
1556681189.84569 5
1556681189.8458931 3
1556681189.846055 6
1556681189.846396 7
1556681189.846845 8
1556681189.847295 9
>>> 

Note that I would typically put "import time" inside the function, since that makes the serialization easier.
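
One way to see the difference is to list the global names each version of the function looks up at run time. The sketch below is only an illustration using the standard dis module; the helper global_names and the two function names are hypothetical, and it does not show how dill itself decides what to serialize.

```python
import dis
import time


def with_global_import(value):
    # Relies on the module-level "import time" above.
    print(time.time(), value)


def with_local_import(value):
    # Imports the module itself, so "time" is a local name here.
    import time
    print(time.time(), value)


def global_names(func):
    """Return the set of names the function loads from global scope."""
    return {
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == 'LOAD_GLOBAL'
    }


print(global_names(with_global_import))
print(global_names(with_local_import))
```

The second function no longer depends on a global named time, so there is less surrounding state that has to be reconstructed in the worker process.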

Mike McKerns