Improve performance by reusing the alpha channel of an RGB texture?

Question

I have a 48-bit RGB16F texture.

https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glTexImage2D.xhtml states that when using RGB, 1.0 will be put into the alpha channel.

Is 1.0 implicit or actually stored?

And in the latter case, my main question: if I put my 16-bit heightmap into the alpha channel, so that the texture becomes RGBA16F, will I improve performance?

All insights are welcome.

score 0 · Answer 1 · answered Mar 13 '17 at 18:55

Is 1.0 implicit or actually stored?

That's implementation specific. If you were asking about 888 vs 8888 textures, I'd tell you that pretty much every implementation is bound to use 32 bits per texel, but I'm not so sure for 16F formats. It is telling that Metal doesn't define an RGB16F format (link) which strongly suggests that PowerVR GPUs at least will pad the format. Vulkan does define RGB16F, but while the spec requires support for R16F, RG16F and RGBA16F it doesn't require support for RGB16F (link), again suggesting lack of native support by some vendors. I wouldn't be surprised if some GPU somewhere does support RGB16F, but I suspect most would just pad. For a more definitive answer you might need to post questions on the GPU forums or experiment by examining memory usage in some controlled conditions.
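If you do run that experiment, it helps to know what numbers to look for. Here is a rough sketch of the expected footprint under each hypothesis. The 1024x1024 size is just an assumed example, and the arithmetic ignores mipmaps, row alignment and any driver-side compression, so treat it as a guide rather than what a given driver will report.

```python
def level_bytes(width, height, bits_per_texel):
    """Bytes for one mip level, ignoring alignment, mipmaps and compression."""
    return width * height * bits_per_texel // 8

W, H = 1024, 1024  # hypothetical texture size, for illustration only

rgb16f_tight = level_bytes(W, H, 48)   # if the 48-bit texels are stored as-is
rgb16f_padded = level_bytes(W, H, 64)  # if each texel is padded to 64 bits
rgba16f = level_bytes(W, H, 64)        # colour plus height packed together
r16f = level_bytes(W, H, 16)           # separate heightmap

for name, size in [
    ("RGB16F (tight)", rgb16f_tight),
    ("RGB16F (padded)", rgb16f_padded),
    ("RGBA16F (packed)", rgba16f),
    ("RGB16F padded + R16F", rgb16f_padded + r16f),
]:
    print(f"{name:22s} {size / 2**20:.1f} MiB")
```

If the measured usage for the RGB16F texture is closer to the padded figure than the tight one, the implementation is almost certainly padding.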

And in the latter case, my main question: if I put my 16-bit heightmap into the alpha channel, so that the texture becomes RGBA16F, will I improve performance?

Are you sampling it at the same time (i.e. from the same shader, with the same UVs)? If so, then yes, absolutely: it will be a better choice than using an RGB16F texture plus a separate R16F texture. If they're not sampled together (e.g. the heightmap is sampled in the vertex shader and the colour in the fragment shader), then it's harder to guess. You'd probably harm performance on the heightmap fetch (the extra bytes per texel waste cache space) while leaving the colour fetch unharmed (there was padding there anyway). Overall you'd lose some performance but save some memory. Any performance loss is probably pretty minor, and if your bottleneck lies elsewhere it may not do any harm at all.
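To put rough numbers on the two cases, here is a back-of-the-envelope sketch of the texel bytes touched per sample. It assumes RGB16F is padded to 8 bytes per texel (which, as said above, is only a guess) and ignores filtering, caching and compression, so the absolute values mean little; only the comparison matters.

```python
# Bytes per texel. The padded RGB16F size is an assumption, not a measured fact.
RGB16F_PADDED = 8
RGBA16F = 8
R16F = 2

# Case 1: colour and height are sampled together, with the same UVs.
together_packed = RGBA16F              # one fetch returns everything
together_split = RGB16F_PADDED + R16F  # two fetches from two textures

# Case 2: only the height is needed (e.g. in the vertex shader).
height_only_packed = RGBA16F           # drags the colour bytes along
height_only_split = R16F               # touches just the heightmap

print("sampled together: packed", together_packed, "B vs split", together_split, "B")
print("height only:      packed", height_only_packed, "B vs split", height_only_split, "B")
```

Under these assumptions packing wins when both values are needed at once, and costs four times the bytes per texel when only the height is read.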

If you do have different temporal usage (e.g. vertex shader using height field, fragment shader using color) then don't pack them. The performance loss may well be minor, but you'll use a lot more bandwidth from memory (effectively loading the entire texture twice on a tile-based architecture), which is a great way to kill battery life. — solidpixel, Mar 14 '17 at 09:39

score 0 · Answer 2 · answered Mar 14 '17 at 09:38

Is 1.0 implicit or actually stored?

I suspect "both", although perhaps not in the way you mean.

Most GPU samplers support implicit rules for missing channels (0.0 for color, 1.0 for alpha), and using these is lower power than sampling / filtering from memory, so I would expect this to use implicit loads for the missing channels.

However, hardware is also usually allergic to loading things which are not a power of two in size (things which span cache line boundaries typically take two cycles to load on most cache architectures), so I would also expect each texel to be padded out to 64 bits. The 16 bits of padding may not contain 1.0; the hardware doesn't care, because it's using the implicit rules.
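As an illustration of the cache line point, this small sketch counts how many texels in a row would straddle a line boundary for tightly packed 6-byte texels versus padded 8-byte ones. The 64-byte line size and the simple linear layout are assumptions made for the example; real GPUs usually store textures in tiled or swizzled layouts, so this only shows the principle.

```python
def straddling_texels(texel_bytes, count, line_bytes=64):
    """Count texels whose first and last byte fall in different cache lines.

    Assumes texels are laid out back to back starting at offset 0.
    """
    straddlers = 0
    for i in range(count):
        first = i * texel_bytes
        last = first + texel_bytes - 1
        if first // line_bytes != last // line_bytes:
            straddlers += 1
    return straddlers

ROW = 1024  # hypothetical number of texels in one row

tight = straddling_texels(6, ROW)   # 48-bit texels, no padding
padded = straddling_texels(8, ROW)  # texels padded to 64 bits

print("6-byte texels straddling a line:", tight)
print("8-byte texels straddling a line:", padded)
```

With 8-byte texels the line size is an exact multiple of the texel size, so no texel ever crosses a boundary; with 6-byte texels some always do.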
