
I have some packed vertex data that runs to 6 bytes per vertex:

glVertexAttribPointer Shader.pos3d_loc, 3, GL_UNSIGNED_BYTE, True, 6, 0
glVertexAttribPointer Shader.norm_loc, 3, GL_UNSIGNED_BYTE, True, 6, 3

Are there any performance penalties (e.g. hidden memory copies) from using strides and offsets that aren't multiples of 4 bytes?

Peeling
  • No copy is done at this point. You either copy the data into a vertex buffer, or the values are read one by one as they are needed. These parameters only tell OpenGL how to interpret your buffer. For instance, normal[i].x = (float)(((ubyte*)ptr)[stride*i]), where ubyte comes from the GL_UNSIGNED_BYTE parameter. – Matic Oblak Mar 22 '17 at 11:24
  • Thanks :) I understand what the function does, but I know there are a few alignment- and format-related gotchas where GL will silently convert data internally at a substantial performance cost. I wondered if specifying an odd stride and alignment would provoke GL into re-buffering the data internally as the final render call happens. – Peeling Mar 24 '17 at 15:05

1 Answer


The short answer is 'it depends':

On PC and (at least some) Android devices, there is no discernible penalty for tightly packing 3-component attributes in this fashion.

On iOS (as of writing this, anyway) there is a large performance penalty. The iOS GL implementation internally unpacks and aligns any attribute that is not 4-byte aligned, incurring both a memory and a CPU cost. The CPU cost might not be noticeable unless you're working with dynamic VBOs.
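
If the iOS penalty matters, one possible workaround is to pad each 3-byte attribute to 4 bytes so that every attribute starts on a 4-byte boundary. The sketch below is only an illustration, written in Python rather than the language of the question; the helper name `pad_vertices` and the constants are hypothetical, and it assumes the 6-byte layout from the question (3 position bytes followed by 3 normal bytes). I have not measured it, so profile on the target device before committing to it.

```python
import struct

# Layout of the padded vertex (hypothetical names, derived from the question's layout).
PADDED_STRIDE = 8  # bytes per vertex after padding
POS_OFFSET = 0     # position bytes start here
NORM_OFFSET = 4    # normal bytes start here, on a 4-byte boundary


def pad_vertices(packed: bytes) -> bytes:
    """Repack 6-byte vertices (3 position + 3 normal bytes) into 8-byte vertices.

    One zero byte is appended after each 3-byte attribute.
    """
    if len(packed) % 6 != 0:
        raise ValueError("expected a whole number of 6-byte vertices")
    out = bytearray()
    for px, py, pz, nx, ny, nz in struct.iter_unpack("6B", packed):
        out += struct.pack("8B", px, py, pz, 0, nx, ny, nz, 0)
    return bytes(out)
```

With a buffer repacked this way, the two glVertexAttribPointer calls from the question would use a stride of 8 and offsets of 0 and 4 instead of 6, 0 and 3. The cost is two extra bytes per vertex, so on platforms where the tight packing carries no penalty it is probably not worth doing.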

Peeling