I have a fairly simple PyCUDA script that is supposed to load a grayscale image of a truck (2048x1365), invert the colors, and save the result:

    import pycuda.autoinit
    import pycuda.driver as device
    from pycuda.compiler import SourceModule as cpp
    import numpy as np
    import cv2

    modify_image = cpp("""
    __global__ void modify_image(int pixelcount, int* inputimage, int* outputimage)
    {
        int id = threadIdx.x + blockIdx.x * blockDim.x;
        if (id >= pixelcount)
            return;
        outputimage[id] = 255 - inputimage[id];
    }
    """).get_function("modify_image")

    print("Loading image")
    image = cv2.imread("truck.jpg", cv2.IMREAD_GRAYSCALE)

    print("Processing image")
    pixels = image.shape[0] * image.shape[1]
    output = np.zeros_like(image)
    modify_image(
        device.In(np.int32(pixels)),
        device.In(image),
        device.Out(output),
        block=(1024,1,1), grid=(pixels // 1024, 1))

    print("Saving image")
    cv2.imwrite("processed.png", output)
    print("Done")

However, when I run it, I get a cuCtxSynchronize error during execution. I made some small modifications to the code and did further testing (https://hastebin.com/fucugucawe.py), and found that it works fine up to a point, but trying to allocate more than around 420,000 bytes causes the error.
I used MSI Afterburner to monitor my GPU's memory usage, and it never goes above 1GB, even while the program is running. My GPU is a GTX 980 with 4GB of VRAM, so I shouldn't be anywhere near its limit. If anybody knows what's going on here, I would really appreciate the help.
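One thing that may be worth checking is how many bytes each side thinks the buffers hold. The sketch below uses only the numbers from the script above. The 1-byte pixel size is an assumption (OpenCV normally returns uint8 data for an 8-bit grayscale image), and the constant names are made up for illustration. It is a sanity check, not a confirmed diagnosis.

```python
# Sizes taken from the script: a 2048x1365 grayscale image, 1024 threads per block.
WIDTH, HEIGHT = 2048, 1365
BLOCK_SIZE = 1024

UINT8_BYTES = 1  # assumption: cv2.imread(..., cv2.IMREAD_GRAYSCALE) yields uint8 pixels
INT_BYTES = 4    # the kernel indexes its buffers as int*, 4 bytes per element in CUDA

pixels = WIDTH * HEIGHT
host_bytes = pixels * UINT8_BYTES   # bytes held by the image and output arrays
kernel_bytes = pixels * INT_BYTES   # bytes the kernel addresses through int*

# Floor division would drop a partial last block; ceiling division never does.
blocks_floor = pixels // BLOCK_SIZE
blocks_ceil = (pixels + BLOCK_SIZE - 1) // BLOCK_SIZE

print("pixels:", pixels)
print("host buffer bytes:", host_bytes)
print("bytes addressed by kernel:", kernel_bytes)
print("blocks (floor / ceil):", blocks_floor, blocks_ceil)
```

If those assumptions hold, the kernel addresses four times as many bytes as each host array contains. That is still far below 4GB of VRAM, but it could mean reads and writes past the end of the allocations. The grid size is not the issue for this particular image, since 2048 * 1365 happens to divide evenly by 1024.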
Here is the exact output produced by the program:

    Loading image
    Processing image
    Traceback (most recent call last):
      File "<path to source file>\imfiltertest.py", line 36, in <module>
        block=(1024,1,1), grid=(pixels // 1024, 1))
      File "C:\Users\<me>\AppData\Local\Programs\Python\Python36\lib\site-packages\pycuda\driver.py", line 405, in function_call
        Context.synchronize()
    pycuda._driver.LaunchError: cuCtxSynchronize failed: unspecified launch failure
    PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
    cuMemFree failed: unspecified launch failure
    PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
    cuMemFree failed: unspecified launch failure
    PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
    cuMemFree failed: unspecified launch failure
    PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
    cuModuleUnload failed: unspecified launch failure
    [Finished in 0.7s]

I've already tried disabling Windows' TDR (Timeout Detection and Recovery), but it had no effect on the issue.