OpenGL: Rendering thousands of cubes with Vertex Arrays, not working too well

Question

I am attempting to use vertex arrays to render about 2097152 cubes with LWJGL (no, not all of them at once). I have implemented several kinds of polygon culling, which raised my performance from around 2 FPS to about 60 FPS. Throughout this project I have been working with immediate-mode rendering, and I think it is time for an upgrade. This is where vertex arrays come in.

I don't want to use VBOs, so I have been experimenting with VAOs for now. I cannot seem to get a practical (or efficient) rendering method down. Everything I try gives me a worse FPS than immediate mode, sad to say. Every frame, I load up a FloatBuffer for every cube that has visible polygons, then draw them using the common vertex array methods. This setup gives me headaches because I get a lower FPS than when using immediate mode without culling any polygons.

I think I am doing something wrong. So among all you bright, aspiring OpenGL/LWJGL programmers out there, does anyone know how this can be done in a more effective and efficient way?

Here is my code for render (truncated to not be too much of a mess):

for(int z = 0; z < chunk.bpc; z++) {
    for(int y = 0; y < chunk.bpc; y++) {
        for(int x = 0; x < chunk.bpc; x++) {
            if(((z == chunk.bpc - 1 || z == 0) || (y == chunk.bpc - 1 || y == 0) || (x == chunk.bpc - 1 || x == 0))
                && chunk.data[(x * chunk.bpc + z) * chunk.bpc + y] == i) {

                List<Float> vertices = new ArrayList<Float>();

                float xp = x + locX, yp = y + locY, zp = z + locZ;

                if(z == chunk.bpc - 1 && chunk.z$ == null) {
                    vertices.add(xp); vertices.add(yp); vertices.add(zp + size);
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp + size);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp + size);
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp + size);
                }

                if(z == 0 && chunk.z_ == null) {
                    vertices.add(xp); vertices.add(yp); vertices.add(zp);
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp);
                }

                if(y == chunk.bpc - 1 && chunk.y$ == null) {
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp);
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp + size);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp + size);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp);
                }

                if(y == 0 && chunk.y_ == null) {
                    vertices.add(xp); vertices.add(yp); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp + size);
                    vertices.add(xp); vertices.add(yp); vertices.add(zp + size);
                }

                if(x == chunk.bpc - 1 && chunk.x$ == null) {
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp);
                    vertices.add(xp + size); vertices.add(yp + size); vertices.add(zp + size);
                    vertices.add(xp + size); vertices.add(yp); vertices.add(zp + size);
                }

                if(x == 0 && chunk.x_ == null) {
                    vertices.add(xp); vertices.add(yp); vertices.add(zp);
                    vertices.add(xp); vertices.add(yp); vertices.add(zp + size);
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp + size);
                    vertices.add(xp); vertices.add(yp + size); vertices.add(zp);
                }

                float[] verts = new float[vertices.size()];
                for(int a = 0; a < verts.length; a++) {
                    verts[a] = vertices.get(a);
                }

                FloatBuffer cubeBuffer = BufferUtils.createFloatBuffer(verts.length);
                cubeBuffer.put(verts);
                cubeBuffer.flip();

                GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);

                GL11.glVertexPointer(3, 0, cubeBuffer);

                GL11.glDrawArrays(GL11.GL_QUADS, 0, verts.length / 3);

                GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
            }
        }
    }
}

(Just ignore some of those variables, they just cull the different polygons)

So I don't know if there is a more efficient way to do this, but if there is, it would be nice if I could get some pointers. Thanks in advance! Oh and...

THERE IS SOMETHING EXTREMELY WRONG WITH HOW I AM RENDERING.

score 1 · Answer 1 · answered Jan 27 '13 at 19:31

Don't regenerate your geometry every frame. Dynamic memory allocations in the inner rendering loop are generally a bad idea.

Generate the geometry for a chunk once when a chunk comes into view and if/when that chunk is modified.

Maybe use an LRU cache to store the geometry so that non-visible chunks slowly get purged from the cache.
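One way to sketch the "build once, cache, purge the least recently used" idea in plain Java is with a `LinkedHashMap` in access order. Everything here is an assumption for illustration: the class name `ChunkGeometryCache`, the methods `getOrBuild` and `invalidate`, the key type, and the idea that a chunk's geometry is a single `FloatBuffer`. None of it comes from the question's code.

```java
import java.nio.FloatBuffer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical LRU cache for per-chunk vertex data (not from the original post).
 * K is whatever identifies a chunk, for example a string or a packed coordinate.
 */
class ChunkGeometryCache<K> extends LinkedHashMap<K, FloatBuffer> {
    private static final long serialVersionUID = 1L;
    private final int maxChunks;

    ChunkGeometryCache(int maxChunks) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxChunks = maxChunks;
    }

    /** Called by LinkedHashMap after each insertion; true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, FloatBuffer> eldest) {
        return size() > maxChunks;
    }

    /** Returns cached geometry, running the (expensive) builder only on a miss. */
    FloatBuffer getOrBuild(K key, Supplier<FloatBuffer> builder) {
        FloatBuffer geometry = get(key); // a hit also marks the entry as recently used
        if (geometry == null) {
            geometry = builder.get();
            put(key, geometry);          // may evict the least recently used chunk
        }
        return geometry;
    }

    /** Call when a chunk is modified so its geometry is rebuilt on next use. */
    void invalidate(K key) {
        remove(key);
    }
}
```

In the render loop you would then call something like `cache.getOrBuild(chunkId, () -> buildGeometry(chunk))` for each visible chunk, where `buildGeometry` stands in for the face-generation code from the question, and call `invalidate(chunkId)` whenever a block in that chunk changes.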

genpfault

I guess my main question now is can I load all my vertices for all my geometry into a single buffer then render from that instead of a new buffer for every single cube...? – CoderTheTyler Jan 27 '13 at 19:56
@MrDoctorProfessorTyler: Are you really asking us how to move your code out of your rendering loop and into some initialization code? – Nicol Bolas Jan 27 '13 at 20:50
I am completely new to this, so yes, I am. Or at least point me in the right direction. – CoderTheTyler Jan 27 '13 at 21:20

score 1 · Accepted Answer · answered Jan 27 '13 at 19:37

Although I have read about LWJGL and VAOs, I have never used them personally; VBOs have always done the trick for me. Looking at your code, though, you seem to be running this snippet every frame. In essence, each frame you rebuild the data, create a new buffer, copy the data into that buffer, and then render from it. Creating a new buffer each frame is expensive, so do it once and then reuse the buffer.

If the data really does change every frame, VAOs and VBOs will probably not give you more performance than immediate mode. The reason is that immediate mode also transfers the data to GPU memory each frame before rendering it, and that transfer is the expensive part. If the data does not change each frame, on the other hand, VAOs and VBOs (and, before them, display lists) give you a speedup by letting you store the data in GPU memory, so you don't have to transfer it from RAM over the PCI-E bus to GPU memory each time.
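As a rough sketch of "create the buffer once and reuse it", the class below packs every visible face of a chunk into one primitive `float[]` and copies it into a single direct `FloatBuffer`. The names `ChunkMesh`, `addQuad`, `finish` and `vertexCount` are made up for this example. The buffer is allocated with plain `java.nio`, which I am assuming is an acceptable stand-in for `BufferUtils.createFloatBuffer` from the question. The GL calls appear only in comments and are the ones the question already uses.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Arrays;

/** Hypothetical helper: one vertex buffer for a whole chunk instead of one per cube. */
class ChunkMesh {
    private static final int FLOATS_PER_QUAD = 4 * 3; // 4 vertices, xyz each

    // Primitive scratch array: avoids the boxing cost of List<Float>.
    private float[] scratch = new float[FLOATS_PER_QUAD * 64];
    private int count; // number of floats written so far

    /** Appends one face (12 floats: x, y, z for each of its 4 corners). */
    void addQuad(float... xyz) {
        if (xyz.length != FLOATS_PER_QUAD) {
            throw new IllegalArgumentException("a quad needs 12 floats");
        }
        if (count + FLOATS_PER_QUAD > scratch.length) {
            scratch = Arrays.copyOf(scratch, scratch.length * 2); // grow geometrically
        }
        System.arraycopy(xyz, 0, scratch, count, FLOATS_PER_QUAD);
        count += FLOATS_PER_QUAD;
    }

    /** Copies everything into one direct, native-order buffer. Call once per rebuild. */
    FloatBuffer finish() {
        FloatBuffer buffer = ByteBuffer.allocateDirect(count * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        buffer.put(scratch, 0, count);
        buffer.flip(); // position 0, limit = count, ready to be read
        return buffer;
    }

    /** Vertex count to pass to the draw call. */
    int vertexCount() {
        return count / 3;
    }

    // Per frame, with the buffer kept from the last finish() call, only this remains:
    //   GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
    //   GL11.glVertexPointer(3, 0, buffer);
    //   GL11.glDrawArrays(GL11.GL_QUADS, 0, mesh.vertexCount());
    //   GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
}
```

The triple loop from the question would call `addQuad` where it currently fills `vertices`, and `finish()` would run once after the loop, when the chunk is first built or after it changes. That leaves one draw call per chunk per frame rather than one per cube.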

Bart

Can I have a single buffer that stores the data for all the vertices instead of creating a new buffer for each cube? – CoderTheTyler Jan 27 '13 at 19:53
Yes. You should put a lot more cubes in a buffer. There is a certain optimum for how many vertices you should store in a buffer, but you can store millions of vertices in it, which is a lot more than the handful of vertices you store in it now. – Bart Jan 27 '13 at 20:02
Can you point me anywhere to a tutorial or example of how to make that work? I already tried adding a lot more vertices but still failed horribly as it only rendered one cube. – CoderTheTyler Jan 27 '13 at 20:16
I don't know anything by heart, but try searching the web, this is fairly basic OpenGL stuff, so I would guess several examples/tutorials are online. Pick one that fits your level of expertise. OpenGL's wiki about buffer objects and vertex specification may also be useful to you: http://www.opengl.org/wiki/Buffer_Object and http://www.opengl.org/wiki/Vertex_Specification – Bart Jan 27 '13 at 20:34
