I wish to make a call from one GPU kernel to another:

```python
import numpy
from numbapro import vectorize

sig = 'int16(int16, int16)'

@vectorize([sig], device=True, target='gpu')
def sum(a, b):
    return a + b

@vectorize([sig], target='gpu')
def proxy(a, b):
    return sum(a, b)

result = proxy(5, 10) # this will fail!
```

I've added `device=True` to the called function, but it doesn't seem to do the trick.

The failing line raises this error: `TypingError: Untyped global name 'sum'`

What may be wrong?

  • As per [here](http://docs.continuum.io/numbapro/CUDAufunc.html), you probably need to explicitly identify `sum` as a CUDA device ufunc with a different decorator in order for the compiler to understand what is going on – talonmies Jul 02 '15 at 06:45
