Vertex batches (geometry groups) and maximum VBO (vertex buffer) size

Question

I did a lot of researches concerning the way to gather vertex data into groups commonly called batches.

Here's for me the 2 main interesting articles on the subject:

https://www.opengl.org/wiki/Vertex_Specification_Best_Practices

http://www.gamedev.net/page/resources/_/technical/opengl/opengl-batch-rendering-r3900

The first article explains what are the best practices on how to manipulate VBOs (max size, format etc).

The second presents a simple example on how to manage vertex memory using batches. According to the author each batch HAS TO contains an instance of a VBO (plus a VAO) and he insists strongly on the fact that the maximimum size of a VBO is ranged between 1Mo (1000000 bytes) to 4Mo (4000000 bytes). The first article advice the same thing. I quote "1MB to 4MB is a nice size according to one nVidia document. The driver can do memory management more easily. It should be the same case for all other implementations as well like ATI/AMD, Intel, SiS."

I have several questions:

1) Does the maximum byte size mentionned above is an absolute rule ? Is it so bad to allocate VBO with a byte size more important than 4Mo (for example 10 Mo) ?

2) How can we do concerning meshes with a total vertex byte size larger than 4Mo? Do I need to split the geometry into several batches?

3) Does a batch inevitably store as attribute a unique VBO or several batches can be store in a single VBO ? (It's two different ways but the first one seems to be the right choice). Are you agree ?

According to the author each batch handle a unique VBO with a maximum size between 1 and 4 Mo and the whole VBO HAS TO contain only vertex data sharing the same material and transformation information). So if I have to batch an other mesh with a different material (so the vertices can't be merged with existing bathes) I have to create a new batch with a NEW vbo instanciated.

So according to the author my second method is not correct : it's not adviced to store several batches into a single VBO.

score 5 · Accepted Answer · answered Feb 03 '16 at 15:25

Does the maximum byte size mentionned above is an absolute rule ? Is it so bad to allocate VBO with a byte size more important than 4Mo (for example 10 Mo) ?

No.

That was a (very) old piece of info that is not necessarily valid on modern hardware.

The issue that led to the 4MB suggestion was about the driver being able to manage memory. If you allocated more memory than the GPU had, it would need to page some in and out. If you use smaller chunks for your buffer objects, the driver is more easily able to pick whole buffers to page out (because they're not in use at present).

However, this does not matter so much. The best thing you can do for performance is to avoid exceeding memory limits entirely. Paging things in and out hurts performance.

So don't worry about it. Note that I have removed this "advice" from the Wiki.

So according to the author my second method is not correct : it's not adviced to store several batches into a single VBO.

I think you're confusing the word "batch" with "mesh". But that's perfectly understandable; the author of that document you read doesn't seem to recognize the difference either.

For the purposes of this discussion, a "mesh" is a thing that is rendered with a single rendering command, which is conceptually separate from other things you would render. Meshes get rendered with certain state.

A "batch" refers to one or more meshes that could have been rendered with separate rendering commands. However, in order to improve performance, you use techniques to allow them all to be rendered with the same rendering command. That's all a batch is.

"Batching" is the process of taking a sequence of meshes and making it possible to render them as a batch. Instanced rendering is one form of batching; each instance is a separate "mesh", but you are rendering lots of them with one rendering call. They use their instance count to fetch their per-instance state data.

Batching takes many forms beyond instanced rendering. Batching often happens at the level of the artist. While the modeller/texture artist may want to split a character into separate pieces, each with their own textures and materials, the graphics programmer tells them to keep them as a single mesh that can be rendered with the same textures/materials.

With better hardware, the rules for batching can be reduced. With array textures, you can give each mesh a particular ID, which it uses to pick which array layer it uses when fetching textures. This allows the artists to give such characters more texture variety without breaking the batch into multiple rendering calls. Ubershaders are another form, where the shader uses that ID to decide how to do lighting rather than (or in addition to) texture fetching.

The kind of batching that the person you're citing is talking about is... well, very confused.

What do you think about that?

Well, quite frankly I think the person from your second link should be ignored. The very first line of his code: class Batch sealed is not valid C++. It's some C++/CX Microsoft invention, which is fine in that context. But he's trying to pass this off as pure C++; that's not fine.

I'm also not particularly impressed by his code quality. He contradicts himself a lot. For example, he talks about the importance of being able to allocate reasonable-sized chunks of memory, so that the driver can more freely move things around. But his GuiVertex class is horribly bloated. It uses a full 16 bytes, four floats, just for colors. 4 bytes (as normalized unsigned integers) would have been sufficient. Similarly, his texture coordinates are floats, when shorts (as unsigned normalized integers) would have been fine for his use case. That would cut his per-vertex cost down from 32 bytes to 10; that's more than a 3:1 reduction.

4MB goes a lot longer when you use reasonably sized vertex data. And the best part? The OpenGL Wiki page he linked to tells you to do exactly this. But he doesn't do it.

Not to mention, he has apparently written this batch manager for a GUI (as alluded to by his GuiVertex type). Yet GUIs are probably the least batch-friendly rendering scenario in game development. You're frequently having to change state like bound textures, the current program (which reads from the texture or not), blending modes, the scissor box, etc.

Now with modern GPUs, there are certainly ways to make GUI renderers a lot more batch-friendly. But he never talks about them. He doesn't mention techniques to use gl_ClipDistance as a way to do scissor boxes with per-vertex data. He doesn't talk about ubershader usage, nor does his vertex format provide an ID that would allow such a thing.

As previously stated, batching is all about not having state changes between objects. But he focuses entirely on vertex state. He doesn't talk about textures, programs, etc. He doesn't talk about techniques to allow multiple objects to be part of the same batch while still having separate transforms.

His class can't really be used for batching of anything that couldn't have just been a single mesh.

Thank you very much for this very interesting answer :)! When I wrote the first version of my post I understood correctly the distinction between a mesh and a batch! My main problem was target on the maximum size of a VBO and how to organize my batches in memory (does a batch is stored into a unique VBO or does a VBO can contain several batches). I've updated my post and gave you the example of the Sponza mesh with 3 propositions to render it! I want to know what do you think about this! Thank you very much in advance! :) — user1364743, Feb 04 '16 at 17:20
@user1364743: OK, you have changed your question into a *completely different question*, which warrants a completely different answer. If you want to ask that question, that's fine. That's what the "Ask A Question" button is for. Stack Overflow is a Q&A site, not a forum. You don't ask followup questions by appending it to your existing question. — Nicol Bolas, Feb 04 '16 at 17:23
Ok! I've just added my post. Here's the link: http://stackoverflow.com/questions/35207922/what-is-the-best-way-to-render-a-complex-mesh-using-vertex-batching — user1364743, Feb 04 '16 at 17:40

Vertex batches (geometry groups) and maximum VBO (vertex buffer) size

1 Answers1

Linked