CUDA cudaMemcpy: invalid argument

Question

Here is my code:

struct S {
    int a, b;
    float c, d;
};
class A {
private:
    S* d;
    S h[3];
public:
    A() {
        cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3));
    }
    void Init();
};

void A::Init() {
    for (int i=0;i<3;i++) {
        h[i].a = 0;
        h[i].b = 1;
        h[i].c = 2;
        h[i].d = 3;
    }
    cutilSafeCall(cudaMemcpy(d, h, 3*sizeof(S), cudaMemcpyHostToDevice));
}

A a;

In fact, this is part of a complex program that contains both CUDA and OpenGL. When I debug the program, it fails at cudaMemcpy with the error message

cudaSafeCall() Runtime API error 11: invalid argument.

Actually, this program was adapted from another one that runs correctly. In that one, however, I used the two variables S* d and S h[3] in the main function instead of in a class. What is even weirder is that when I implement this class A in a small program, it works fine. I've also updated my driver, but the error still exists.

Could anyone give me a hint on why this happens and how to solve it? Thanks.

By the way, the cudaMemcpy call returns cudaErrorInvalidValue. — TonyLic, May 14 '12 at 08:31
Where is the line `A a;`? Is it inside a function, or is it global? If it is global, then the constructor might be called before you have a valid device context (I'm not sure about this, but it's possible). If that's the case, then the d pointer passed to cudaMemcpy would be invalid. — harrism, May 29 '12 at 10:48
Can you post the CUDA configuration you are using, please? I can't reproduce your error with an old GeForce 9400m and the CUDA 5.0 preview. — jopasserat, Jun 15 '12 at 12:06

phoad · Answer 1 · 2012-09-02T16:36:36.440

Because blocking memory operations in CUDA, such as cudaMemcpy, act as synchronization points, errors from earlier calls that were not checked with cudaThreadSynchronize can show up as errors on the memory call.

So if a memory operation reports an error, try placing a cudaThreadSynchronize before it and check its result.
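
As a rough sketch of that check (not the asker's actual program): the kernel `touch`, the helper `check`, and the surrounding `main` are hypothetical names introduced only for illustration. It uses cudaDeviceSynchronize, which replaced the deprecated cudaThreadSynchronize in CUDA 4.0 and later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct S {
    int a, b;
    float c, d;
};

// Hypothetical kernel standing in for whatever work ran before the copy.
__global__ void touch(S* p) {
    p[threadIdx.x].a = 1;
}

// Hypothetical helper: print the error and the place where it was detected.
static bool check(cudaError_t err, const char* where) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", where, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    S h[3] = {};
    S* d = NULL;
    if (!check(cudaMalloc((void**)&d, sizeof(S) * 3), "cudaMalloc")) return 1;

    touch<<<1, 3>>>(d);
    // Launch-time errors (for example a bad configuration) are reported here.
    if (!check(cudaGetLastError(), "kernel launch")) return 1;

    // Wait for earlier work so that its errors are reported at this point
    // instead of being attributed to the cudaMemcpy that follows.
    if (!check(cudaDeviceSynchronize(), "before cudaMemcpy")) return 1;

    if (!check(cudaMemcpy(d, h, sizeof(S) * 3, cudaMemcpyHostToDevice),
               "cudaMemcpy")) return 1;

    cudaFree(d);
    return 0;
}
```

If the message is printed at "before cudaMemcpy", the copy itself is probably not at fault; if it is printed at "cudaMemcpy", the arguments of the copy are the place to look.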

Be sure that the first malloc statement is actually being executed. If the problem is with the initialization of CUDA, as @harrism suggests, then wouldn't it fail at that statement? Try adding printf statements to verify that the proper initialization is performed. I think invalid argument errors are generally caused by using uninitialized memory areas.

Add a printf to your constructor that shows the address of the cudaMalloc'ed memory area:

A()
{
    d = NULL;
    cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3));
    printf("D: %p\n", d);
}

Try copying to memory that is allocated locally; that is, move the cudaMalloc to just above the cudaMemcpy (just for testing).

void A::Init()
{
    for (int i=0;i<3;i++)
    {
        h[i].a = 0;
        h[i].b = 1;
        h[i].c = 2;
        h[i].d = 3;
    }
    cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3)); // moved here from the constructor (for testing only)
    cutilSafeCall(cudaMemcpy(d, h, 3*sizeof(S), cudaMemcpyHostToDevice));
}
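
If that test makes the error go away, one possible follow-up is to keep CUDA calls out of the constructor altogether, so that a global `A a;` does not touch the device before the program has set it up. The following is only a sketch under that assumption; the `Release` method, the `bool` return value of `Init`, and the plain error checks (instead of cutilSafeCall) are illustrative choices, not part of the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct S {
    int a, b;
    float c, d;
};

class A {
    S* d;      // device copy, allocated on first Init()
    S h[3];    // host copy
public:
    A() : d(NULL) {}   // no CUDA calls here: safe for a global object
    bool Init();
    void Release() {   // hypothetical cleanup method
        if (d) { cudaFree(d); d = NULL; }
    }
};

bool A::Init() {
    for (int i = 0; i < 3; i++) {
        h[i].a = 0;
        h[i].b = 1;
        h[i].c = 2;
        h[i].d = 3;
    }
    if (d == NULL) {
        // Allocate lazily, after main() has started and the device is usable.
        cudaError_t err = cudaMalloc((void**)&d, sizeof(h));
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
            d = NULL;
            return false;
        }
    }
    cudaError_t err = cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

A a;   // global, as in the question

int main() {
    // Any device or OpenGL setup the real program needs would go before Init().
    if (!a.Init()) return 1;
    a.Release();
    return 0;
}
```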

Good luck.
