
I'm writing code that should ideally run on the GPU for speed, using CuPy. However, I also want it to be able to run (albeit more slowly) with a NumPy implementation. At the moment I am doing the following:

import numpy as np
if gpu_present:
    import cupy as cp
else:
    import numpy as cp
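(gpu_present is set beforehand; one possible way to set it, just a sketch that uses a successful CuPy import as a proxy for an available GPU, would be:)

try:
    import cupy  # succeeds only when CuPy is installed, typically alongside a GPU
    gpu_present = True
except ImportError:
    gpu_present = False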

I am worried that I might run into problems later on. Is this good practice?

TomNorway

1 Answer


When the script is small and the namespace to use can be fixed at startup, I often use a global variable named xp (same as your solution). A similar pattern that I also sometimes use is to make it an instance attribute of a class (again named xp); this accommodates future extensions better, because each instance can hold a distinct namespace. A related, much more robust but cumbersome approach is to pass xp as the first argument of every function.
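A minimal sketch of these three variants (the class and function names here are only illustrative):

import numpy as np

# Variant 1: module-level switch, fixed once at startup.
xp = np  # or: xp = cupy

# Variant 2: each instance carries its own array namespace.
class Model:
    def __init__(self, xp=np):
        self.xp = xp
    def zeros(self, shape):
        return self.xp.zeros(shape)

# Variant 3: pass the namespace explicitly as the first argument.
def normalize(xp, x):
    return x / xp.linalg.norm(x)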

When writing a library that may be used in any circumstances (e.g. multithreaded code, or code that uses both NumPy and CuPy in a single process), it is better to have each function/class choose the namespace appropriately for its arguments. I often use the get_array_module utility for that purpose. CuPy provides this function (cupy.get_array_module), though it requires CuPy to be installed; Chainer also has one, and it is simple to write yourself. With this utility, the code works with either NumPy or CuPy arrays without a global switch.
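For example (a sketch; the hand-rolled fallback and the softplus function are just illustrations):

import numpy

def get_array_module(*args):
    # Hand-rolled version that also works when CuPy is not installed.
    try:
        import cupy
        return cupy.get_array_module(*args)  # returns cupy or numpy based on the arguments
    except ImportError:
        return numpy

def softplus(x):
    xp = get_array_module(x)  # numpy for NumPy arrays, cupy for CuPy arrays
    return xp.log1p(xp.exp(-abs(x))) + xp.maximum(x, 0)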

Also note that NumPy >= 1.17 can dispatch CuPy arrays to the appropriate CuPy routines, so in most cases you can pass CuPy arrays directly to numpy.* functions. If your code only performs computation on arrays it is given, you do not even need to use the cupy namespace at all (you still need it to create a new array that is not derived from an existing one, e.g. cupy.ones and cupy.random.*).
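For example, assuming a CuPy version that supports the __array_function__ protocol:

import numpy as np
import cupy

x = cupy.ones((3, 4))               # creating the array still needs the cupy namespace
y = np.sum(x, axis=0)               # dispatched to CuPy; y is a cupy.ndarray
z = np.concatenate([x, x], axis=0)  # likewise runs on the GPU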

Seiya Tokui