
One of my fragment shaders emulates some of the basic OpenGL ES 1.1 multi-texturing features.

As part of that, I have the following GLSL function declared:

void applyTexture(int tuIdx) {
    lowp vec4 texColor = texture2D(s_cc3Textures[tuIdx], v_texCoord[tuIdx]);
    int tuMode = u_cc3TextureUnitMode[tuIdx];

    if (tuMode == k_GL_COMBINE)
        combineTexture(texColor, tuIdx);
    else if (tuMode == k_GL_MODULATE)
        fragColor *= texColor;
    else if (tuMode == k_GL_DECAL)
        fragColor.rgb = (texColor.rgb * texColor.a) + (fragColor.rgb * (1.0 - texColor.a));
    else if (tuMode == k_GL_REPLACE)
        fragColor = texColor;
    else if (tuMode == k_GL_ADD) {
        fragColor.rgb += texColor.rgb;
        fragColor.a *= texColor.a;
    }
    else if (tuMode == k_GL_BLEND) {
        fragColor.rgb =  (fragColor.rgb * (1.0 - texColor.rgb)) + (u_cc3TextureUnitColor[tuIdx].rgb * texColor.rgb);
        fragColor.a *= texColor.a;
    }
}

The k_GL_XXX constants are #defined at the top of the shader.

To perform multi-texturing, this function is called multiple times. For the SGX GPUs under iOS, this is successfully accomplished using a for loop:

for (int tuIdx = 0; tuIdx < MAX_TEXTURES; tuIdx++) {
    if (tuIdx == u_cc3TextureCount) return;     // Break out once we've applied all the textures
    applyTexture(tuIdx);
}

where u_cc3TextureCount is an int uniform and MAX_TEXTURES is #defined to be 2.

Strangely, attempting to manually unroll this loop does not work, and results in nothing being drawn:

if (u_cc3TextureCount > 0) {
    applyTexture(0);
}
if (u_cc3TextureCount > 1) {
    applyTexture(1);
}

Even more strangely, neither does the even more basic:

applyTexture(0);
applyTexture(1);

which is completely unexpected behaviour, to say the least!

I have confirmed that the value of u_cc3TextureCount is 2, so all three of these approaches should yield the same results.

The reason I am attempting to unroll the loop is that several GPUs, including some Android GPUs and the new Apple A7 GPU in the iPad Air, do not work correctly with the loop. I'm trying to find a way of calling the applyTexture() function multiple times that will work on all GPUs.

Is anyone able to explain to me why such basic loop unrolling is not working?
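
One avenue that may be worth testing, offered as an unconfirmed sketch rather than a diagnosis: GLSL ES 1.00 (Appendix A) only mandates support for indexing sampler arrays with constant-index-expressions, meaning constants and for-loop indices. Inside applyTexture(), tuIdx is a function parameter, which is not one, so s_cc3Textures[tuIdx] relies on optional behaviour that may differ between compilers. The variant below moves every array access to the call site with literal indices. The helper name applyTextureColor(), the v_color varying, the uniform and varying declarations, and the #define values are assumptions for illustration and are not taken from the original shader.

```glsl
// Sketch only: a hypothetical restructuring, not confirmed to fix the problem.
precision mediump float;

#define MAX_TEXTURES   2
#define k_GL_MODULATE  0x2100   // assumed to mirror the GL_MODULATE enum value
#define k_GL_REPLACE   0x1E01   // assumed to mirror the GL_REPLACE enum value

uniform sampler2D s_cc3Textures[MAX_TEXTURES];
uniform int u_cc3TextureUnitMode[MAX_TEXTURES];
uniform int u_cc3TextureCount;
varying vec2 v_texCoord[MAX_TEXTURES];
varying lowp vec4 v_color;      // hypothetical starting color from the vertex shader

lowp vec4 fragColor;

// Hypothetical variant of applyTexture(): it receives the already-sampled
// color and the unit mode, so no array is indexed by a function parameter.
void applyTextureColor(lowp vec4 texColor, int tuMode) {
    if (tuMode == k_GL_MODULATE)
        fragColor *= texColor;
    else if (tuMode == k_GL_REPLACE)
        fragColor = texColor;
    // Remaining modes would follow the same pattern as in applyTexture().
}

void main() {
    fragColor = v_color;

    // All sampler, uniform and varying arrays are indexed with literals here.
    if (u_cc3TextureCount > 0)
        applyTextureColor(texture2D(s_cc3Textures[0], v_texCoord[0]),
                          u_cc3TextureUnitMode[0]);
    if (u_cc3TextureCount > 1)
        applyTextureColor(texture2D(s_cc3Textures[1], v_texCoord[1]),
                          u_cc3TextureUnitMode[1]);

    gl_FragColor = fragColor;
}
```

If this variant renders correctly where the unrolled calls to applyTexture(0) and applyTexture(1) do not, that would point toward the parameter-based indexing rather than the unrolling itself. It does not explain why the for loop version works on the SGX GPUs, so treat it as an experiment only.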

Bill Hollings
  • That's a horribly expensive shader, with all those conditionals and that loop. Is there a way to break that out into separate shaders for each of the blend modes, then apply them as needed for the textures individually? The added overhead of separate draw calls and shader switching should be small compared to the savings in shader speed. – Brad Larson Nov 25 '13 at 23:43
  • Conditionals in shader code cause dramatic reductions in rendering speed because embedded GPUs have no branch prediction capabilities. You may have even stumbled onto a GLSL compiler bug because this is just not usually done. – ClayMontgomery Nov 26 '13 at 20:22
  • Yes, I'm well aware that it's not a performant shader. It is not meant to be a production shader. It is an intentionally generic shader that gets applied by default in the framework when a developer has not yet supplied their own shader (which they should). Nevertheless, regardless of what's inside applyTexture(), I'm trying to solve the issue of why invoking it in an unrolled loop is not working. – Bill Hollings Nov 27 '13 at 14:16
  • Did you eventually figure out what the bug was when calling a function multiple times? I'm experiencing the same issue on iOS devices. – Nicolas Nov 27 '17 at 09:03

0 Answers