
I'm working on a fragment shader. It works, but it still needs some optimisation.

As far as I know, in most cases branches in GLSL are flattened, so both sides are executed. I've eliminated most of the if-else conditions, but some of them have to stay as they are, because both branches are expensive to execute. I know that HLSL has a [branch] attribute for this problem. Is there a way to solve it in GLSL?

My code looks like this (the conditions are not uniform, their results depend on calculations in the shader):

if( condition ) {
    expensive calculations...
}

if( condition2 ) {
    expensive calculations...
}

if( condition3 ) {
    expensive calculations...
}

...

One "expensive calculation" can modify the variables on which a later condition depends. It is possible that more than one calculation is executed.

I know that there are older or mobile GPUs which do not support branching at all. In that case, nothing can be done about this issue.

Iter Ator
  • You've already ruled out most of the simple optimisation techniques for your case. I think you have to completely rethink the approach, and it's going to be tricky. Showing what kind of expensive calculations you are doing might help further. – codetiger Sep 12 '16 at 07:26
  • I cannot avoid branching if I do something like ray tracing or ray marching. – Iter Ator Sep 12 '16 at 10:21
  • The way modern GPGPUs work requires them to execute all branches of a given if statement in sequence (possibly causing a group of cores to idle in the meantime). It's therefore a limitation that unfortunately cannot be eliminated. As @codetiger rightfully pointed out, there's probably no easy solution here, but you should try to find one (or an approximation) nonetheless. – Bartvbl Sep 12 '16 at 10:35
  • If I write: `if( condition ) { ...some code...; return; } ... other calculations ...`, will the calculations after the return statement also be executed? – Iter Ator Sep 12 '16 at 10:41
  • That depends. Cores are typically organised in groups, which share the code being executed but apply their operations to different pieces of data. As such, if only a single core needs to execute the else clause whereas all other cores are executing the "if" clause, all cores need to wait until that one particular core is finished. The same is true for return statements. Branches can therefore leave a number of cores just idling, waiting for others to finish. As such, you can't expect a speedup just by inserting a return statement. In that sense, GPUs work quite differently from CPUs. – Bartvbl Sep 12 '16 at 10:51
  • Using a return statement does not help much to fix this issue (at least in one of my similar cases). – codetiger Sep 12 '16 at 11:11
  • AFAIK, the "both conditions are executed and the output is masked" behaviour does not apply anymore, since modern GPUs *do have* proper branching support. As the other comments pointed out, pixels are processed in parallel within groups (also known as warps), which can be 2x2 pixels or larger. When one core chooses a different (more expensive) branch than the others, the group will wait; however, if all cores within a group take the faster route, there is no waiting, so performance increases. – LJᛃ Sep 18 '16 at 01:49
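
The group behaviour described in the comments above can be illustrated with a toy cost model. This is only a sketch under simplifying assumptions, not a description of any real GPU: lanes in a group run in lockstep, a branch body is paid for whenever at least one lane in the group takes it, and every lane's conditions are fixed up front. The function name `group_cost` and the cost values are hypothetical.

```python
# Toy model of lockstep execution within one group (warp) of lanes.
# Assumption: the group pays for a branch body if ANY lane takes it,
# and skips it entirely only if NO lane takes it.

def group_cost(lane_conditions, branch_costs):
    """Return the modelled cost for one group.

    lane_conditions: one tuple of booleans per lane, one boolean per branch.
    branch_costs: hypothetical cost of each branch body.
    """
    total = 0
    for branch, cost in enumerate(branch_costs):
        if any(lane[branch] for lane in lane_conditions):
            total += cost
    return total


# Three equally expensive `if` bodies, as in the question (made-up numbers).
BRANCH_COSTS = [100, 100, 100]

# Coherent group: all four lanes take only the first branch.
coherent = [(True, False, False)] * 4

# Divergent group: each lane wants a different branch, one lane wants none.
divergent = [
    (True, False, False),
    (False, True, False),
    (False, False, True),
    (False, False, False),
]

# Fully flattened code would always pay for every branch body.
flattened_cost = sum(BRANCH_COSTS)

print("coherent :", group_cost(coherent, BRANCH_COSTS))
print("divergent:", group_cost(divergent, BRANCH_COSTS))
print("flattened:", flattened_cost)
```

In this model a coherent group pays for a single branch body, while a fully divergent group costs as much as flattened code, which is why the benefit of real branching depends on how similar neighbouring pixels are.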

1 Answer


GLSL has no mechanism to enforce branching (or to enforce flattening a branch).

Nicol Bolas