I am trying to use CuPy to run a computation on the GPU. Here is the code:
```
import numpy as np
import cupy as cp

# on CPU
x_cpu = np.array([1, 2, 3])
%timeit l2_cpu = np.linalg.norm(x_cpu)
# on GPU
x_gpu = cp.array([1, 2, 3])
%timeit l2_gpu = cp.linalg.norm(x_gpu)
```
Here is the output:
```
4 µs ± 18 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
48.7 µs ± 86.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
Question:
Why is CuPy slower than NumPy in my case? I expected CuPy to be faster than NumPy. What did I do wrong, and how can I fix it?
Environment:
- OS: Ubuntu 20.04
- Video:
```
> nvidia-smi
Wed Sep 15 22:11:36 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 41%   33C    P8     1W / 260W |    184MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    627367      C   ...conda3/envs/t1/bin/python      181MiB |
+-----------------------------------------------------------------------------+
```
Also, I use Python 3.8 and have the following installed (among other packages):
- cupy 8.3.0
- cupy-cuda114 9.4.0
- cudatoolkit 10.1.243 h6bb024c_0
Update:
I also tried an array with 1023272 items. Here are the results:
- 175 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
- 579 µs ± 97.1 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I also checked GPU utilization with nvidia-smi and can confirm that the GPU was involved in the calculation.
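In case the way I timed things matters, here is a rough sketch of how the same comparison could be reproduced outside IPython. CuPy launches GPU kernels asynchronously, so this version synchronizes the stream before and after the timed loop; I am not claiming this explains the gap. The helper name `time_call` and the `repeats`/`warmup` values are my own assumptions, not part of NumPy or CuPy, and the CuPy part is skipped if no working GPU setup is found.

```python
import time

import numpy as np


def time_call(fn, sync=None, repeats=100, warmup=10):
    """Hypothetical helper: mean wall-clock seconds per call of fn().

    If sync is given, it is called once after warm-up and once after the
    timed loop, so that pending asynchronous work is included in the timing.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats


n = 1023272  # same array size as in the update above
x_cpu = np.arange(n, dtype=np.float64)
t_cpu = time_call(lambda: np.linalg.norm(x_cpu))
print(f"NumPy: {t_cpu * 1e6:.1f} us per call")

try:
    import cupy as cp

    x_gpu = cp.arange(n, dtype=cp.float64)
    # Wait for all work queued on the default stream to finish.
    gpu_sync = cp.cuda.Stream.null.synchronize
    t_gpu = time_call(lambda: cp.linalg.norm(x_gpu), sync=gpu_sync)
    print(f"CuPy:  {t_gpu * 1e6:.1f} us per call")
except Exception as exc:  # CuPy missing or no usable CUDA device
    print(f"CuPy measurement skipped: {exc}")
```

The warm-up calls are there so that one-time costs such as kernel compilation do not end up in the timed loop.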