I am trying to compare a simple addition task on both the CPU and the GPU, but the results I get are strange.
First of all, let me explain how I managed to run the GPU task.
Here is my code. It simply adds two float arrays element by element, first on the CPU and then on the GPU using Aparapi:
package gpu;

import com.aparapi.Kernel;
import com.aparapi.Range;

public class Try {
    public static void main(String[] args) {
        final int size = 512;
        final float[] a = new float[size];
        final float[] b = new float[size];
        for (int i = 0; i < size; i++) {
            a[i] = (float) (Math.random() * 100);
            b[i] = (float) (Math.random() * 100);
        }
        //##############CPU-TASK########################
        long start = System.nanoTime();
        final float[] sum = new float[size];
        for (int i = 0; i < size; i++) {
            sum[i] = a[i] + b[i];
        }
        long finish = System.nanoTime();
        long timeElapsed = finish - start;
        //######################################
        //##############GPU-TASK########################
        final float[] sum2 = new float[size];
        Kernel kernel = new Kernel() {
            @Override
            public void run() {
                int gid = getGlobalId();
                sum2[gid] = a[gid] + b[gid];
            }
        };
        long start1 = System.nanoTime();
        kernel.execute(Range.create(size));
        long finish2 = System.nanoTime();
        long timeElapsed2 = finish2 - start1;
        //##############GPU-TASK########################
        System.out.println("cpu" + timeElapsed);
        System.out.println("gpu" + timeElapsed2);
        kernel.dispose();
    }
}
My specs are:
Aparapi is running on an untested OpenCL platform version: OpenCL 3.0 CUDA 11.6.13
Intel Core i7 6850K @ 3.60GHz Broadwell-E/EP 14nm Technology
2047MB NVIDIA GeForce GTX 1060 6GB (ASUStek Computer Inc)
The results I get (in nanoseconds, from System.nanoTime()) are:
cpu12000
gpu5732829900
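To put these two readings in perspective, here is a small sketch that converts them from nanoseconds into more readable units and computes their ratio. The class and method names (`TimingSummary`, `toMillis`, `toSeconds`, `ratio`) are hypothetical helpers written only for this illustration. They are not part of my program or of Aparapi.

```java
public class TimingSummary {
    // Converts a nanosecond reading to milliseconds.
    static double toMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    // Converts a nanosecond reading to seconds.
    static double toSeconds(long nanos) {
        return nanos / 1_000_000_000.0;
    }

    // How many times larger the slower reading is than the faster one.
    static double ratio(long slowerNanos, long fasterNanos) {
        return (double) slowerNanos / fasterNanos;
    }

    public static void main(String[] args) {
        long cpuNanos = 12000L;       // "cpu" value printed by my program
        long gpuNanos = 5732829900L;  // "gpu" value printed by my program

        System.out.println("cpu (ms): " + toMillis(cpuNanos));
        System.out.println("gpu (s): " + toSeconds(gpuNanos));
        System.out.println("gpu/cpu ratio: " + ratio(gpuNanos, cpuNanos));
    }
}
```

In other words, the CPU loop took about 0.012 ms while the single GPU call took about 5.7 s, a factor of roughly 480,000.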
My question is: why is the GPU so slow here? Why does the CPU outperform the GPU? I expected the GPU to be faster than the CPU. Is my measurement wrong, and is there any way to improve it?
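One check I could run is to time several consecutive executions instead of a single one, on the assumption (not verified here for Aparapi) that a first run may include one-time setup costs. Below is a minimal sketch using only the standard library. `RepeatedTiming` and `timeRuns` are hypothetical names, and the example times the CPU loop only. The GPU call could be passed in the same way, for example `timeRuns(() -> kernel.execute(Range.create(size)), 10)`.

```java
import java.util.Arrays;

public class RepeatedTiming {
    // Runs the task `runs` times and returns each run's elapsed time in nanoseconds.
    static long[] timeRuns(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        final int size = 512;
        final float[] a = new float[size];
        final float[] b = new float[size];
        final float[] sum = new float[size];
        for (int i = 0; i < size; i++) {
            a[i] = (float) (Math.random() * 100);
            b[i] = (float) (Math.random() * 100);
        }

        // Same element-wise addition as in my program, timed ten times in a row.
        long[] cpuRuns = timeRuns(() -> {
            for (int i = 0; i < size; i++) {
                sum[i] = a[i] + b[i];
            }
        }, 10);

        System.out.println("cpu runs (ns): " + Arrays.toString(cpuRuns));
    }
}
```

Comparing the first entry with the later ones would show whether the first run is an outlier or whether every run is equally slow.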