I have written a vector expression template library for the CPU using template metaprogramming. However, I am having difficulty generating a GPU kernel for a given expression. Please advise on how, given the expression tree, I can build the string for the expression (e.g. c = a + b) and the list of parameters to pass as kernel arguments. I have read about the techniques in papers but have difficulty putting them into code. One problem is that I don't know how to store the names of the variables (a, b, c) used in the expression. I guess that just giving them generated unique names like x0, x1, x2 might work. A code snippet would be of great help. Thanks.

Here are the kernel template and the generated kernel (which computes a = b + c), taken from "CUDA expression templates" https://pdfs.semanticscholar.org/5d08/a871b72f12a7ee40aeb2a69bca27a23733db.pdf

extern "C" __global__ void kernel(float* a,
    /* parameter list */, unsigned int size) {
    // Global thread index; one thread per output element.
    unsigned int idx = blockDim.x * blockIdx.x + threadIdx.x;
    if (idx < size) {
        a[idx] = /* evaluation line */;
    }
}
Listing 4: The Kernel prototype.

extern "C" __global__ void kernel(float* a,
    float* b, float* c, unsigned int size) {
    // Global thread index; one thread per output element.
    unsigned int idx = blockDim.x * blockIdx.x + threadIdx.x;
    if (idx < size) {
        a[idx] = b[idx] + c[idx];
    }
}
Listing 5: Kernel generated by compiling Listing 2.
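
The naming idea from the question (x0, x1, x2, ...) can be sketched in host-side C++ roughly as follows. This is only a minimal sketch under stated assumptions, not the method from the paper: `Vec`, `Add`, `is_expr`, `KernelBuilder`, `emit` and `make_kernel` are hypothetical names invented for illustration, only addition is handled, and the output parameter is called `out` rather than `a`. Each leaf gets the next free name while the tree is walked, and its pointer is recorded in the same order so that it can later be passed as the matching kernel argument.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical leaf node: a non-owning view of a float buffer.
struct Vec {
    const float* data;
    std::size_t size;
};

// Hypothetical binary node for element-wise addition.
template <class L, class R>
struct Add {
    L lhs;
    R rhs;
};

// Trait that restricts operator+ to expression types only.
template <class T> struct is_expr : std::false_type {};
template <> struct is_expr<Vec> : std::true_type {};
template <class L, class R> struct is_expr<Add<L, R>> : std::true_type {};

template <class L, class R,
          class = typename std::enable_if<is_expr<L>::value &&
                                          is_expr<R>::value>::type>
Add<L, R> operator+(const L& l, const R& r) { return {l, r}; }

// Collects the kernel parameter list and the matching argument pointers.
struct KernelBuilder {
    std::vector<const float*> args;  // pointers, in parameter order
    std::ostringstream params;       // "const float* x0, const float* x1, "

    // Assigns the next generated name to a leaf and records its pointer.
    std::string leaf(const Vec& v) {
        assert(v.data != nullptr);
        std::string name = "x" + std::to_string(args.size());
        args.push_back(v.data);
        params << "const float* " << name << ", ";
        return name + "[idx]";
    }
};

// Leaf case: emit an indexed access to the generated parameter name.
std::string emit(const Vec& v, KernelBuilder& kb) { return kb.leaf(v); }

// Inner node case: emit both operands left to right, then join them.
template <class L, class R>
std::string emit(const Add<L, R>& e, KernelBuilder& kb) {
    std::string l = emit(e.lhs, kb);
    std::string r = emit(e.rhs, kb);
    return "(" + l + " + " + r + ")";
}

// Fills the kernel prototype with the parameter list and evaluation line.
template <class E>
std::string make_kernel(const E& expr, KernelBuilder& kb) {
    std::string body = emit(expr, kb);
    std::ostringstream src;
    src << "extern \"C\" __global__ void kernel(float* out, "
        << kb.params.str() << "unsigned int size) {\n"
        << "    unsigned int idx = blockDim.x * blockIdx.x + threadIdx.x;\n"
        << "    if (idx < size) {\n"
        << "        out[idx] = " << body << ";\n"
        << "    }\n"
        << "}\n";
    return src.str();
}
```

Calling `make_kernel(b + c, kb)` with two `Vec` leaves yields a kernel whose evaluation line is `out[idx] = (x0[idx] + x1[idx]);`, and `kb.args` then holds the pointers of `b` and `c` in that order. Note that in this sketch a vector appearing twice in an expression gets two separate parameters; deduplicating by pointer would be a possible refinement.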
danny
  • You want the expression to create a separate kernel? Wouldn't it be better (and easier at the same time!) to ask the user to create its own kernel, and use your templates only inside them to produce the device code? – CygnusX1 Jun 15 '17 at 17:31
  • Yes, I want different kernels for an expression. I have edited my post with the template for the expression, and the required kernel for a=b+c. – danny Jun 15 '17 at 17:41
  • hm... you might want to double-check your code... Only Listing 4 is fully visible, and only part of 3&5 (no 1&2 at all) – CygnusX1 Jun 16 '17 at 07:01

0 Answers