I've spent a lot of time trying to find the cause of this problem. The following code attempts to generate a sequence of normally distributed random variables using cuRAND on the device. It generates a few values successfully, but then crashes with an "an illegal memory access was encountered" error. Any help is much appreciated.

main.cu

#include <stdio.h>
#include <cuda.h>
#include <curand_kernel.h>

class A {
public:

    __device__ A(const size_t& seed) {

        printf("\nA()"); 

        curandState state;

        curand_init(seed, 0, 0, &state);

        for(size_t i = 0; i < 1000; ++i)
            printf("\n%f", curand_normal(&state));

    }

    __device__ ~A() { printf("\n~A()"); }
};

/// Kernel
__global__ void kernel(const size_t& seed) {

    printf("\nHello from Kernel...");

    A a(seed);

    return;
}

int main(void) {


    kernel<<<1,1>>>(1);

    cudaError_t cudaerr = cudaDeviceSynchronize();
    if (cudaerr != CUDA_SUCCESS)
        printf("kernel launch failed with error \"%s\".\n",
               cudaGetErrorString(cudaerr));


    return 0;
}

Output

Hello from Kernel...
A()
0.292537
-0.718359
0.958011
0.633711kernel launch failed with error "an illegal memory access was encountered".

I have run this both on my machine (CUDA 7.0) and on a supercomputing cluster (CUDA 6.5), and the result is the same on both.

AlmostSurely

1 Answer

Remove the pass-by-reference (&) from the kernel parameter, i.e. declare it as size_t seed rather than const size_t& seed.

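You are not allowed to write GPU kernels that take pass-by-reference parameters. A reference is in effect an address; here the kernel receives the address of a host temporary holding the value 1, and device code cannot legally dereference a host address, which leads to the illegal memory access. For the same reason, a GPU kernel cannot modify a host variable (ignoring Unified Memory, zero-copy, and related mechanisms, which are not at issue here).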
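As a minimal sketch of the fix, the program from the question can be reduced to the version below, with the seed passed by value. The switch to unsigned long long for the seed (to match the seed type of curand_init), the shorter loop, and the extra cudaGetLastError check are choices made for this illustration, not part of the original code.

```cuda
#include <stdio.h>
#include <curand_kernel.h>

class A {
public:
    // The seed is taken by value, so no host address ever reaches the device.
    __device__ A(unsigned long long seed) {
        curandState state;
        curand_init(seed, 0, 0, &state);   // seed, sequence, offset, state

        // Shortened from 1000 to 5 draws to keep the output readable.
        for (int i = 0; i < 5; ++i)
            printf("%f\n", curand_normal(&state));
    }
};

// Kernel parameter passed by value: it is copied to the device at launch.
__global__ void kernel(unsigned long long seed) {
    A a(seed);
}

int main(void) {
    kernel<<<1, 1>>>(1ULL);

    // Check for launch errors first, then for errors during execution.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();

    if (err != cudaSuccess) {
        printf("kernel failed with error \"%s\".\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Inside device code, passing a reference to a device-side variable is still fine. The restriction concerns only the parameters of the __global__ function, whose arguments originate on the host.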

Robert Crovella