I am playing around with the differences between numpy and cupy and have noticed that, between these two similar programs I have created, the cupy version is much slower despite the fact that it runs on a GPU.
Here is the numpy version:
import time
import numpy as np

size = 5000
upperBound = 20
dataSet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Pre-generate random string lengths and random character indices
dataLength = np.random.randint(0, high=upperBound, size=size, dtype='l')
randomNumber = np.random.randint(0, high=62, size=size * upperBound, dtype='l')

count = 0
dataCount = 0
start_time = time.time()
for i in range(size):
    lineData = ""
    for j in range(dataLength[i]):
        lineData = lineData + dataSet[randomNumber[count]]
        count = count + 1
    print(lineData)
    dataCount = dataCount + 1
elapsed = str(time.time() - start_time)
print("------------------------\n" + "It took this many seconds: " + elapsed)
print("There were " + str(dataCount) + " data generations.")
Here is the cupy version:
import time
import cupy as cp

size = 5000
upperBound = 20
dataSet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Pre-generate random string lengths and random character indices on the GPU
dataLength = cp.random.randint(0, high=upperBound, size=size, dtype='l')
randomNumber = cp.random.randint(0, high=62, size=upperBound * size, dtype='l')

count = 0
dataCount = 0
start_time = time.time()
for i in range(size):
    lineData = ""
    for j in range(int(dataLength[i])):
        lineData = lineData + str(dataSet[int(randomNumber[count])])
        count = count + 1
    print(lineData)
    dataCount = dataCount + 1
elapsed = str(time.time() - start_time)
print("-------------------\n" + "It took this many seconds: " + elapsed)
print("There were " + str(dataCount) + " data generations.")
They are essentially the same code, except that one uses numpy and the other uses cupy. I was expecting cupy to execute faster because of the GPU usage, but that was not the case: the run time for numpy was 0.032 seconds, while the run time for cupy was 0.484 seconds.
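As a side note on the measurement itself, here is a minimal sketch (reusing the same size and upperBound values as above) of how the GPU part alone could be timed with an explicit cp.cuda.Device().synchronize() call, since cupy kernel launches are asynchronous. I have not verified how much this changes the reported numbers:

import time
import cupy as cp

size = 5000
upperBound = 20

# Time only the GPU random-number generation, waiting for the device to
# finish before the clock is read (CuPy kernel launches are asynchronous).
start_time = time.time()
dataLength = cp.random.randint(0, high=upperBound, size=size, dtype='l')
randomNumber = cp.random.randint(0, high=62, size=upperBound * size, dtype='l')
cp.cuda.Device().synchronize()
print("GPU random generation took:", time.time() - start_time, "seconds")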