I'm trying to debug a program that compiles without errors or warnings and runs correctly. The problem is that when I step through it with cuda-gdb, no CUDA kernel ever appears to be launched (the debugger output is completely different from the one shown in the NVIDIA cuda-gdb guide), yet the program still runs without any errors. At all times cuda-gdb reports no CUDA kernels, devices, or threads, and the focus does not appear to be set on anything either. I'm using the 4.2 release of CUDA-GDB.
This is what I get from the debugger when it should launch the kernel:
Breakpoint 1, matrixMulGPU (M=0x609160, N=0x609270, P=0x609490, Width=8)
at matrixMul1.cu:141
141 MatrixMulKernel<<<dimGrid, dimBlock>>>(Md, Nd, Pd, Width);
(cuda-gdb) step
MatrixMulKernel (__cuda_0=0x210000, __cuda_1=0x210100, __cuda_2=0x210200,
__cuda_3=8) at matrixMul1.cu:103
103 __global__ void MatrixMulKernel(float *Md, float *Nd, float *Pd, int Width){
(cuda-gdb) step
__device_stub__Z15MatrixMulKernelPfS_S_i (__par0=0x210000, __par1=0x210100,
__par2=0x210200, __par3=8)
at tmpxft_000016d4_00000000-1_matrixMul1.cudafe1.stub.c:5
5 tmpxft_000016d4_00000000-1_matrixMul1.cudafe1.stub.c: No such file or directory.
in tmpxft_000016d4_00000000-1_matrixMul1.cudafe1.stub.c
(cuda-gdb) step
cudaLaunch<char> (
entry=0x4011ea "UH\211\345SH\203\354(H\211}\350H\211u\340H\211U؉MԋM\324H\213U\330H\213]\340H\213E\350H\211\336H\211\307\350\024\377\377\377H\203\304([\311\303UH\211\345SH\203\354(\277Pn@") at cuda_runtime.h:958
958 return cudaLaunch((const char*)entry);
(cuda-gdb) step
959 }
(cuda-gdb) step
MatrixMulKernel (__cuda_0=0x210000, __cuda_1=0x210100, __cuda_2=0x210200,
__cuda_3=8) at matrixMul1.cu:121
121 }
My CUDA device is a GeForce 8400M GS, and the deviceQuery check ran without problems. I have no idea how to solve this, and the NVIDIA forum is offline these days.
Thanks a lot in advance.