I have PyCUDA code that runs correctly in a single process. Can Python's multiprocessing run this code in several subprocesses? When I try, it fails. Am I doing something wrong?

I used multiprocessing.Process to start two simple worker processes, and both fail with the error shown after the code.

    import pycuda.autoinit
    import pycuda.driver as drv
    import numpy
    from pycuda.compiler import SourceModule
    from multiprocessing import Pool, Manager, Process


    def ffunc(i, return_dict, a, b, multiply_them):
        dest = numpy.zeros_like(a)
        multiply_them(
            drv.Out(dest), drv.In(a), drv.In(b),
            block=(400, 1, 1), grid=(1, 1))
        return_dict[i] = dest


    if __name__ == '__main__':
        mod = SourceModule("""
        __global__ void multiply_them(float *dest, float *a, float *b)
        {
         const int i = threadIdx.x;
         dest[i] = a[i] * b[i];
        }
        """)
        multiply_them = mod.get_function("multiply_them")
        aa = numpy.random.randn(2, 400).astype(numpy.float32)
        bb = numpy.random.randn(2, 400).astype(numpy.float32)
        manager = Manager()
        return_dict = manager.dict()
        jobs = []
        for i in range(2):
            p = Process(target=ffunc, args=(i, return_dict, aa[i], bb[i], multiply_them))
            jobs.append(p)
            p.start()
        for p in jobs:
            p.join()
        print(return_dict)

Process Process-2:
Traceback (most recent call last):
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/vision/lpx/AE23D/test_pycuda.py", line 22, in ffunc
    block=(400,1,1), grid=(1,1))
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/site-packages/pycuda/driver.py", line 382, in function_call
    func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: initialization error
Process Process-3:
Traceback (most recent call last):
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/vision/lpx/AE23D/test_pycuda.py", line 22, in ffunc
    block=(400,1,1), grid=(1,1))
  File "/home/vision/anaconda3/envs/py3b/lib/python3.6/site-packages/pycuda/driver.py", line 382, in function_call
    func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: initialization error
{}

Process finished with exit code 0

I'm not sure whether PyCUDA can run in several processes at all. Any suggestions are welcome.

李培鑫

2 Answers

Fortunately, I solved the problem.

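Add one line at the top of the if __name__ == '__main__': block, before any Process or Manager is created (the script also needs import multiprocessing, since the question only imports names from that module):

    multiprocessing.set_start_method('spawn')

李培鑫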

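CUDA should not be initialized before a fork. A forked child inherits the parent's CUDA state, which is not usable in the child, so CUDA calls made there fail with an initialization error like the one in the question.

You can find more details here: https://forums.developer.nvidia.com/t/cuda8-0-bug-child-process-forked-after-cuinit-get-cuda-error-not-initialized-on-cuinit/45764

A spawned process starts from a fresh interpreter and initializes CUDA for itself, which is why multiprocessing.set_start_method('spawn') fixes the problem.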
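As a rough sketch of that idea (not necessarily the exact script the asker ended up with), the version below has each spawned child import PyCUDA and compile the kernel itself, so nothing CUDA-related is created in the parent. It assumes that the kernel handle should not be passed between processes, because a CUDA function handle belongs to the context that loaded it. The names KERNEL_SOURCE, gpu_worker and main are made up for this illustration, and mp.get_context('spawn') is used as a local alternative to calling set_start_method('spawn') once at startup.

```python
import multiprocessing as mp
import random

# Same kernel as in the question, kept as plain text so it pickles trivially.
KERNEL_SOURCE = """
__global__ void multiply_them(float *dest, float *a, float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] * b[i];
}
"""


def gpu_worker(i, return_dict, a, b):
    """Runs in the child: initialize CUDA here, never in the parent."""
    # Deliberately imported inside the function so that CUDA is
    # initialized by the child process that actually uses it.
    import numpy
    import pycuda.autoinit  # noqa: F401  (creates this process's context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    a = numpy.asarray(a, dtype=numpy.float32)
    b = numpy.asarray(b, dtype=numpy.float32)

    # Compile in the child; the handle is only valid in this context.
    multiply_them = SourceModule(KERNEL_SOURCE).get_function("multiply_them")

    dest = numpy.zeros_like(a)
    # One thread per element, so a.size must not exceed the block limit.
    multiply_them(drv.Out(dest), drv.In(a), drv.In(b),
                  block=(a.size, 1, 1), grid=(1, 1))
    return_dict[i] = dest.tolist()


def main(n_jobs=2, n=400):
    # A spawn context starts each child from a fresh interpreter.
    ctx = mp.get_context("spawn")
    manager = ctx.Manager()
    return_dict = manager.dict()

    jobs = []
    for i in range(n_jobs):
        # Plain lists of floats: only picklable data crosses the boundary.
        a = [random.gauss(0.0, 1.0) for _ in range(n)]
        b = [random.gauss(0.0, 1.0) for _ in range(n)]
        p = ctx.Process(target=gpu_worker, args=(i, return_dict, a, b))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    return dict(return_dict)


if __name__ == "__main__":
    results = main()
    print(sorted(results))  # indices of the jobs that returned a result
```

The if __name__ == "__main__": guard matters with spawn, because each child re-imports the main script and must not start new processes while doing so.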

hamnghi