
Here is my code:

    struct S {
        int a, b;
        float c, d;
    };
    class A {
    private:
        S* d;
        S h[3];
    public:
        A() {
            cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3));
        }
        void Init();
    };

    void A::Init() {
        for (int i=0;i<3;i++) {
            h[i].a = 0;
            h[i].b = 1;
            h[i].c = 2;
            h[i].d = 3;
        }
        cutilSafeCall(cudaMemcpy(d, h, 3*sizeof(S), cudaMemcpyHostToDevice));
    }

    A a;

In fact, this is part of a complex program that contains both CUDA and OpenGL code. When I debug the program, it fails at the cudaMemcpy call with the error message:

cudaSafeCall() Runtime API error 11: invalid argument.

Actually, this program was adapted from another one that runs correctly. In that one, however, I used the two variables S* d and S h[3] in the main function instead of in a class. What is even weirder is that when I implement this class A in a small program, it works fine. I've also updated my driver, but the error still exists.

Could anyone give me a hint on why this happens and how to solve it? Thanks.

einpoklum
TonyLic
  • By the way, the cudaMemcpy call returns cudaErrorInvalidValue. – TonyLic May 14 '12 at 08:31
  • Where is the line `A a;`? Is it inside a function, or is it global? If it is global, then the constructor might be called before you have a valid device context (I'm not sure about this, but it's possible). If that's the case, then the d pointer passed to cudaMemcpy would be invalid. – harrism May 29 '12 at 10:48
  • Can you post the CUDA configuration you use, please? I can't reproduce your error with an old GeForce 9400m and CUDA 5.0 preview. – jopasserat Jun 15 '12 at 12:06

1 Answer


Because memory operations in CUDA such as cudaMemcpy are blocking, they act as synchronization points. Errors from earlier asynchronous work, if not checked with cudaThreadSynchronize, will therefore appear to be errors from the memory calls.

So if a memory operation reports an error, try placing a cudaThreadSynchronize call before it and check its result.
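
As a minimal sketch of that check (not the asker's actual program), the snippet below drains any pending error right before a cudaMemcpy. The helper name `reportPendingCudaError` and the small `host`/`dev` buffers are hypothetical, made up for illustration. It uses cudaDeviceSynchronize, which replaced the deprecated cudaThreadSynchronize in newer toolkits; on an old toolkit the older name should behave the same way here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reports any error left pending by earlier CUDA work.
static bool reportPendingCudaError(const char* where) {
    // Wait for all previously issued work to finish; asynchronous
    // failures are returned here.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        // Also pick up an error recorded by an earlier runtime call or launch.
        err = cudaGetLastError();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error before %s: %s\n", where, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    float host[4] = {0.f, 1.f, 2.f, 3.f};  // illustrative data only
    float* dev = NULL;
    if (cudaMalloc((void**)&dev, sizeof(host)) != cudaSuccess) return 1;

    // If this reports an error, the failure belongs to earlier code,
    // not to the cudaMemcpy below.
    if (!reportPendingCudaError("cudaMemcpy")) return 1;

    cudaError_t err = cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaFree(dev);
    return 0;
}
```

If the helper prints nothing and the copy still fails, the error most likely comes from the arguments of the copy itself, for example the device pointer.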


Make sure that the first cudaMalloc statement is actually being executed. If the problem is the initialization of CUDA, as @harrism suggests, then wouldn't it fail at that statement? Try adding printf statements to see whether the proper initializations are performed. I think invalid argument errors are generally caused by using uninitialized memory areas.

  1. Add a printf to your constructor that shows the address of the cudaMalloc'ed memory area:

    A()
    {
        d = NULL;
        cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3));
        printf("D: %p\n", d);
    }
    
  2. Try copying to an area that is allocated locally, namely move the cudaMalloc to just above the cudaMemcpy (just for testing).

    void A::Init()
    {
        for (int i=0;i<3;i++)
        {
            h[i].a = 0;
            h[i].b = 1;
            h[i].c = 2;
            h[i].d = 3;
        }
        cutilSafeCall(cudaMalloc((void**)&d, sizeof(S)*3)); // here!..
        cutilSafeCall(cudaMemcpy(d, h, 3*sizeof(S), cudaMemcpyHostToDevice));
    }
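
If the test in step 2 makes the error go away, one possible longer-term arrangement is to keep all CUDA calls out of the constructor, so that a global `A a;` never touches the runtime before main starts. The following is only a sketch under that assumption; it is not confirmed that constructor timing is the cause here. The `Release` method and the bool return value are hypothetical additions, and plain cudaError_t checks stand in for cutilSafeCall so the example does not depend on the SDK's cutil headers.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct S {
    int a, b;
    float c, d;
};

class A {
    S* d;     // device copy
    S h[3];   // host copy
public:
    A() : d(NULL) {}   // no CUDA calls in the constructor
    bool Init();       // allocates and uploads; returns false on failure
    void Release();    // hypothetical cleanup, called explicitly before exit
};

bool A::Init() {
    for (int i = 0; i < 3; i++) {
        h[i].a = 0;
        h[i].b = 1;
        h[i].c = 2;
        h[i].d = 3;
    }
    cudaError_t err = cudaMalloc((void**)&d, sizeof(h));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    printf("D: %p\n", (void*)d);  // same diagnostic as in step 1
    err = cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

void A::Release() {
    cudaFree(d);
    d = NULL;
}

A a;  // global, as in the question; construction makes no CUDA calls

int main() {
    if (!a.Init()) return 1;
    a.Release();
    return 0;
}
```

Cleanup is an explicit method rather than a destructor because the destructor of a global object runs during program exit, when the CUDA runtime may already be shutting down.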
    

Good luck.

phoad