When I use Aparapi with an AMD Radeon R7 450 graphics card with older drivers installed, the size parameter in the code below can be as large as 268,435,455. This is consistent with the card's maximum 2D image size of 16384 x 16384 = 268,435,456 (screenshot below).

When I use the AMD Radeon RX 5700 XT, my usual card, which reports the same maximum 2D image size of 16384 x 16384 = 268,435,456, I get this error:

Total Local Kernel Size: Exceeds Maximum Allowed Local Kernel Size: 256 failed [ERROR] Failed to execute command

In other words, size cannot be greater than 256. The same problem occurs with the NVIDIA GeForce RTX 3060 Ti, even though it reports a maximum 2D image size of 32768 x 32768 = 1,073,741,824.

What could be the problem? Why does this code work worse on the newer video cards than on the older one?
Code:
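// Number of elements in each array (2^28 - 1).
int size = 268435455;
double[] a = new double[size];
double[] b = new double[size];
double[] c = new double[size];
// Fill the two input arrays.
for (int i = 0; i < size; i++)
{
    a[i] = i;
    b[i] = i;
}
// Each work-item adds one pair of elements.
Kernel kernel = new Kernel()
{
    @Override
    public void run()
    {
        int gid = getGlobalId();
        c[gid] = a[gid] + b[gid];
    }
};
// Launch one work-item per array element.
kernel.execute(size);
kernel.dispose();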
Screenshots from GPU Caps Viewer (images not included) for: AMD Radeon R7 450, AMD Radeon RX 5700 XT, NVIDIA GeForce RTX 3060 Ti.
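One thing that might be worth checking is how the value 268,435,455 divides into work-groups. The sketch below is plain Java arithmetic, not Aparapi code. It rests on an assumption that is not confirmed anywhere in this post: that when only kernel.execute(size) is called, the local (work-group) size is derived from a divisor of size that fits under a device limit. The class and method names are made up for illustration, and 256 and 1024 are example limits only.

```java
// Illustrative arithmetic only; class and method names are hypothetical.
class LocalSizeDivisors {

    // Returns the largest divisor of n that is not greater than limit.
    static int largestDivisorAtMost(int n, int limit) {
        for (int d = Math.min(n, limit); d > 1; d--) {
            if (n % d == 0) {
                return d;
            }
        }
        return 1; // 1 divides every positive n
    }

    public static void main(String[] args) {
        int size = 268435455; // 2^28 - 1, an odd number

        // Example limits (assumed values, not read from any device).
        System.out.println(largestDivisorAtMost(size, 256));  // prints 215
        System.out.println(largestDivisorAtMost(size, 1024)); // prints 645
    }
}
```

Under that assumption, a device that allows up to 1024 work-items per group would end up with a derived local size of 645, which is above the 256 named in the error message, while a device limited to 256 would get 215. This is only a possible explanation and has not been verified on the cards above.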
ADDITION:
We found a workaround for this problem. With the following code, size can be greater than 256:
Range range = Range.create2D(size, 1);
kernel.execute(range);
kernel.dispose();
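Another way to pass an explicit range can be sketched under the assumption that the 256 in the error message is a limit on the local (work-group) size. Range.create(globalSize, localSize) takes both sizes, and the global size is expected to be a multiple of the local size. The helper below only does the rounding in plain Java. Its name and the value 256 are illustrative, and the Aparapi calls appear as comments because they were not tested on the cards above.

```java
// Hypothetical helper for choosing a padded global size; not part of Aparapi.
class PaddedRange {

    // Rounds size up to the next multiple of localSize.
    static int roundUpToMultiple(int size, int localSize) {
        long padded = ((long) size + localSize - 1) / localSize * localSize;
        if (padded > Integer.MAX_VALUE) {
            throw new ArithmeticException("padded size does not fit in an int");
        }
        return (int) padded;
    }

    public static void main(String[] args) {
        int size = 268435455;
        int localSize = 256; // assumed limit, taken from the error message
        int globalSize = roundUpToMultiple(size, localSize);
        System.out.println(globalSize); // prints 268435456

        // Untested usage with Aparapi on the classpath:
        //   Range range = Range.create(globalSize, localSize);
        //   kernel.execute(range);
        // The kernel's run() would then need a guard such as
        //   if (gid < size) { c[gid] = a[gid] + b[gid]; }
        // because the padded range contains ids past the end of the arrays.
    }
}
```

The rounding is done in long arithmetic so that sizes close to Integer.MAX_VALUE fail with an exception rather than overflowing silently.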
However, when the arrays are two-dimensional, as in the kernel below, Range.create2D(size, 1) allows a size greater than 256 only on the AMD Radeon RX 5700 XT. On the NVIDIA GeForce RTX 3060 Ti, a size greater than 256 still does not work.
Kernel kernel = new Kernel()
{
    @Override
    public void run()
    {
        int gid = getGlobalId();
        // Each work-item processes one column (index gid) across all rows.
        for (int i = 0; i < size; i++)
        {
            c[i][gid] = a[i][gid] + b[i][gid];
        }
    }
};
Perhaps AMD video cards are a better fit for Aparapi than NVIDIA ones. This may be worth keeping in mind when buying a video card.