I have a routine that is designed to be called under any of three processing modes: SingleCpuThread, ParallelCpuThreads, and ParallelGpuThreads.
Within the routine, the math is performed using Alea.DeviceFunction so that the routine is compatible with Alea when it is called under the ParallelGpuThreads mode.
Question: When the same routine is called under the other two modes and the math is still performed using DeviceFunction, does it use the GPU and incur the associated overhead (marshaling, etc.)? If so (which would be bad), what's the best way to let the same routine use .NET's Math functions instead of DeviceFunction, without having to duplicate the whole routine as separate CPU-happy and GPU-happy versions?