Can't get simple CUDA program to work

Question

I'm trying the "hello world" program of CUDA programming: adding two vectors together. Here's the program I have tried:

#include <cuda.h>
#include <stdio.h> 
#define  SIZE 10

__global__  void vecAdd(float* A, float* B, float* C) 
{ 
   int i = threadIdx.x; 
   C[i] = A[i] + B[i]; 
} 

int  main() 
{ 
     float A[SIZE], B[SIZE], C[SIZE]; 
     float *devPtrA, *devPtrB, *devPtrC; 
     size_t memsize= SIZE * sizeof(float); 

     for (int i=0; i< SIZE; i++) {
        A[i] = i;
        B[i] = i;
     }

     cudaMalloc(&devPtrA, memsize); 
     cudaMalloc(&devPtrB, memsize); 
     cudaMalloc(&devPtrC, memsize); 
     cudaMemcpy(devPtrA, A, memsize,  cudaMemcpyHostToDevice); 
     cudaMemcpy(devPtrB, B, memsize,  cudaMemcpyHostToDevice); 

     vecAdd<<<1, SIZE>>>(devPtrA,  devPtrB, devPtrC); 
     cudaMemcpy(C, devPtrC, memsize,  cudaMemcpyDeviceToHost); 

     for (int i=0; i<SIZE; i++) 
         printf("C[%d]: %f + %f => %f\n",i,A[i],B[i],C[i]); 

     cudaFree(devPtrA); 
     cudaFree(devPtrB); 
     cudaFree(devPtrC); 
}

Compiled with:

nvcc cuda.cu

Output is this:

C[0]: 0.000000 + 0.000000 => 0.000000
C[1]: 1.000000 + 1.000000 => 0.000000
C[2]: 2.000000 + 2.000000 => 0.000000
C[3]: 3.000000 + 3.000000 => 0.000000
C[4]: 4.000000 + 4.000000 => 0.000000
C[5]: 5.000000 + 5.000000 => 0.000000
C[6]: 6.000000 + 6.000000 => 0.000000
C[7]: 7.000000 + 7.000000 => 0.000000
C[8]: 8.000000 + 8.000000 => 366987238703104.000000
C[9]: 9.000000 + 9.000000 => 0.000000

Every time I run it, I get a different answer for C[8], but the results for all the other elements are always 0.000000.

The Ubuntu 11.04 system a 64-bit Xeon server with 4 cores running the latest NVIDIA drivers (downloaded on Oct 4, 2012). The card is an EVGA GeForce GT 430 with 96 cores and 1GB of RAM.

What should I do to figure out what's going on?

did you install cuda sdk from [here](http://developer.nvidia.com/cuda/cuda-downloads)? (not the toolkit) — none, Oct 05 '12 at 21:12
My guess is CUDA fails to initialize. I recommended adding error checking to each and every CUDA API call. — njuffa, Oct 05 '12 at 21:13
@gokcehan I downloaded the driver, toolkit, and SDK from that website. I'm not sure what to do with the SDK, though. It appears to contain mostly documentation and sample code. — Barry Brown, Oct 05 '12 at 21:20
As I remember toolkit contains the compiler (nvcc) whereas SDK has the library. I remember having a similar problem because I didn't installed one of them. The strange thing is that it doesn't give you an error when you try to run without installing SDK. — none, Oct 05 '12 at 21:27
The SDK is not needed to compile and run CUDA code. If you use something from the SDK, such as cutil from some of the SDK's, then of course you need it. But your code doesn't appear to have any dependency on the SDK. Your toolkit installation is probably OK since you can compile with nvcc. Which leaves the GPU and driver. the comment from @njuffa is definitely recommended and is good practice always. You might also run nvidia-smi -a from a linux command line to see if the GPU is properly available. — Robert Crovella, Oct 05 '12 at 22:20
When I compile your code with nvcc cuda.c as you state, I get a number of errors. The NVCC users manual suggests that cuda source files containing cuda device code should be in files with a .cu extension. Do you get any different results if you name your file cuda.cu and compile with nvcc cuda.cu? — Robert Crovella, Oct 05 '12 at 22:22
@RobertCrovella You're right. My file ends in .cu, as it's supposed to. I didn't copy-and-paste correctly into SO. — Barry Brown, Oct 05 '12 at 23:52
CUDA functions have a return value `cudaError_t`. You should see if you get any errors (I suspect you do). — tskuzzy, Oct 06 '12 at 00:01
I ran `nvidia-smi -a` and it said the drivers weren't loaded. I re-installed the drivers and now the program works. Yay! But then I rebooted the machine and the drivers aren't loading. So now the question boils down to this: how do I load the drivers on a system that is not running X? — Barry Brown, Oct 06 '12 at 00:05
There's a few ways to skin the cat. You can sudo nvidia-smi -a, and it will load the drivers. But you'll lose them on reboot. For a more permanent solution, something like [this](http://www.resultsovercoffee.com/2011/01/cuda-in-runlevel-3.html) is appropriate. — Robert Crovella, Oct 06 '12 at 01:47
Did you try running deviceQuery? That essentially confirms that everything from driver to SDK is installed and you are ready to run .cu codes on GPU. Also, dont you think, you should cast the device pointers in `cudaMalloc` as `void**`? — Recker, Oct 06 '12 at 07:13
@abhinole I've seen some sample programs that cast to (void**) and some that don't. None of the samples in the C Programming Guide do the cast and the compiler doesn't complain. — Barry Brown, Oct 08 '12 at 05:47
@RobertCrovella Thanks for the link to the web page about forcing the drivers to load! — Barry Brown, Oct 08 '12 at 05:48

Seçkin Savaşçı · Answer 1 · 2012-11-11T00:10:51.767

It seems that your drivers are not initialized, but not checking the cuda return codes is always a bad practice, you should avoid that. Here is simple function + Macro that you can use for cuda calls(quoted from Cuda by Example):

static void HandleError( cudaError_t err,
                         const char *file,
                         int line ) {
    if (err != cudaSuccess) {
        printf( "%s in %s at line %d\n", cudaGetErrorString( err ),
                file, line );
        exit( EXIT_FAILURE );
    }
}
#define HANDLE_ERROR( err ) (HandleError( err, __FILE__, __LINE__ ))

Now start calling your functions like:

HANDLE_ERROR(cudaMemcpy(...));

score 1 · Accepted Answer · answered Oct 09 '12 at 04:50

1

Most likely cause: the NVIDIA drivers weren't loaded. On a headless Linux system, X Windows isn't running, so the drivers aren't loaded at boot time.

Run nvidia-smi -a as root to load them and get a confirmation in the form of a report.

Although the drivers are now loaded, they still need to be initialized every time a CUDA program is run. Put the drivers into persistent mode with nvidia-smi -pm 1 so they remain initialized all the time. Add this to a boot script (such as rc.local) so it happens at every boot.

answered Oct 09 '12 at 04:50

Barry Brown

20,233
15
69
105

It is worth pointing out that the solution to this problem has been explicitly covered in the Linux release notes and/or the Linux getting started PDF pretty much forever. – talonmies Oct 09 '12 at 05:46
2

That's good to know. If only the installer for the NVIDIA drivers had pointed me there. Instead, it said "see the documentation for your vendor distribution" and Ubuntu's docs assume everyone is running a GUI. – Barry Brown Oct 09 '12 at 19:13

Can't get simple CUDA program to work

2 Answers2