I have CUDA code that performs a calculation on the GPU. I am using clock() to measure how long it takes.
My code structure is:
#include <stdio.h>
#include <time.h>

__global__ static void sum(){
    // calculates sum
}

extern "C"
int run_kernel(int array[], int nelements){
    clock_t start, end;
    start = clock();
    // perform operation on gpu - call sum
    end = clock();
    double elapsed_time = ((double) (end - start)) / CLOCKS_PER_SEC;
    printf("time required : %lf\n", elapsed_time);
    return 0;
}
But the reported time is always 0.0000. I checked by printing the start and end values: start has some value, but end is always zero.
Any idea what might be the cause? Are there any alternatives for measuring the time?
Any help would be appreciated.
Thanks
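
On the question of alternatives, one commonly suggested option is CUDA events, which record timestamps on the GPU itself. Kernel launches are asynchronous with respect to the host, so a host timer read immediately after the launch may capture little more than the launch overhead. The following is only a minimal sketch under stated assumptions: the `scale` kernel is a hypothetical stand-in for the real `sum` kernel, and `time_kernel_ms` is a hypothetical helper name, neither of which comes from the code above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: each thread doubles one element.
__global__ void scale(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Hypothetical helper: returns the kernel's elapsed GPU time in milliseconds,
// or -1.0f if allocation or timing fails.
float time_kernel_ms(int n) {
    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return -1.0f;
    cudaMemset(d_data, 0, n * sizeof(int));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start);                 // enqueue a timestamp before the kernel
    scale<<<blocks, threads>>>(d_data, n);  // asynchronous launch
    cudaEventRecord(stop);                  // enqueue a timestamp after the kernel
    cudaEventSynchronize(stop);             // block the host until 'stop' is reached

    float ms = 0.0f;
    cudaError_t err = cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return (err == cudaSuccess) ? ms : -1.0f;
}

int main() {
    float ms = time_kernel_ms(1 << 20);
    if (ms < 0.0f) {
        printf("timing failed\n");
        return 1;
    }
    printf("time required : %f ms\n", ms);
    return 0;
}
```

If a host-side timer is preferred instead, calling `cudaDeviceSynchronize()` after the kernel launch and before reading the end time should make the host wait for the kernel to finish. Note that `clock()` reports processor time rather than wall-clock time, so its result may still differ from the time actually spent waiting on the GPU.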