I have the following piece of code that needs to be optimized (and later ported to the GPU through SYCL or ArrayFire):

struct Item {
    float value;
    int f;
    float Func(float);
    float Func1(float);
    float Func2(float);
    float Func3(float);
};

float Item::Func(float v) {
    value = v;
    switch(f) {
        case 1: return Func1(v);
        case 2: return Func2(v);
        case 3: return Func3(v);
    }
    return Func1(v);
}

std::vector<Item> items;

AFAIK, on GPUs the function pointer approach is not suitable.

Is there a more performant approach on CPUs and/or GPUs than this one?

Pietro
  • Impossible to give a useful answer without seeing what these functions do and what they have in common. –  Jan 20 '22 at 13:47
  • Would be more readable to use a `default: return Func1`. –  Jan 20 '22 at 13:49
  • What is the problem with the approach on GPUs? Does not compile? Bad performance? – Sebastian Jan 20 '22 at 14:40
  • @Sebastian - I have not implemented it, yet. I just wanted to be sure there are no other better, more efficient approaches. – Pietro Jan 20 '22 at 16:01
  • @YvesDaoust - The functions are all mathematical expressions, with no loops and no branches. – Pietro Jan 20 '22 at 16:04
  • @YvesDaoust - From some compilers I get a warning for a missing return value if I add the default switch option and I remove the final return. I could keep both, of course. – Pietro Jan 20 '22 at 16:07
  • Rather than passing an int and using a switch, you can pass a function pointer instead. I don't know if that is more "GPU" friendly. Are all items getting different `f` ? Is there a pattern ? –  Jan 20 '22 at 16:22
  • Not sure about SYCL, but CUDA started accepting function pointers to kernels as parameters from the host side quite early (CUDA 3.2, Compute Capability 2.x). – Sebastian Jan 20 '22 at 17:51

1 Answer

There is a blog post on this website about implementing an alternative to function pointers in SYCL. The solution uses templates and function objects instead. I believe the restriction exists because most hardware doesn't support jumping to computed addresses.
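
To give a rough idea of the template and function object approach, here is a minimal sketch in plain C++ (not the blog post's code, and not SYCL itself). The bodies of Op1, Op2 and Op3 are made-up stand-ins, since the question does not show what Func1, Func2 and Func3 compute, and ApplyAll and Dispatch are hypothetical names. It also assumes the items can be grouped by their `f` value, so the selector is resolved once per batch rather than once per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Func1..Func3. The question only says they are
// mathematical expressions with no loops and no branches.
struct Op1 { float operator()(float v) const { return v + 1.0f; } };
struct Op2 { float operator()(float v) const { return v * 2.0f; } };
struct Op3 { float operator()(float v) const { return v * v; } };

// The operation is a template parameter, so the call target is known at
// compile time and can be inlined. No function pointer is involved.
// On a GPU, the loop body would presumably become the kernel body.
template <typename Op>
void ApplyAll(const std::vector<float>& in, std::vector<float>& out, Op op) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = op(in[i]);
    }
}

// The switch runs once per batch on the host side, not once per element.
// Assumption: every value in the batch shares the same selector f.
void Dispatch(int f, const std::vector<float>& in, std::vector<float>& out) {
    switch (f) {
        case 2:  ApplyAll(in, out, Op2{}); break;
        case 3:  ApplyAll(in, out, Op3{}); break;
        default: ApplyAll(in, out, Op1{}); break;  // same fallback as the question: Func1
    }
}
```

If the `f` values are mixed within one vector, this only pays off after sorting or partitioning the items by `f`. Otherwise the per-element switch from the question is probably the simpler choice.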

Rod Burns