
I am working on a voxel game engine using LWJGL 3, the OpenGL binding for JVM languages (Scala in my case), targeting OpenGL 4.5. I'm currently stuck on chunk rendering (32*32*32 blocks). To render an object I first give it a unique ID, treating similar objects, such as simple blocks, as one object with different transformations. I create a single VAO at initialization, and once all preparation is done I render a whole chunk by looping over every block, passing its data to the shader, and calling drawElements with the appropriate offset, taken from the ID. Rendering this way drops the frame rate from 3000 fps (with only 3 axis lines and a huge grid object rendered separately) to 1-2 fps. How should I render a chunk correctly?

I used a basic block rendering tutorial as a reference.

The code that causes the fps drop:

def render(shader: Shader): Unit = {
  for (x <- 0 until SIZE) {
    for (y <- 0 until SIZE) {
      for (z <- 0 until SIZE) {
        val obj = blocks(x)(y)(z)
        if (obj != null) {
          val M = obj.getTransformationMatrix() * Matrix4F.matrixTRANSLATION(obj.getPosition())
          shader.setUniformMat4f("M", M)
          shader.setUniformMat4f("MI", M.inverse())
          shader.setUniformBool("lightInteraction", obj.lightInteraction)
          shader.setUniform1f("smoothness", obj.smoothness)
          shader.setUniform3f("matD", obj.matDiffuse)
          shader.setUniform3f("matS", obj.matSpecular)
          GL13.glActiveTexture(GL13.GL_TEXTURE0) // Texture unit 0
          glBindTexture(GL_TEXTURE_2D, obj.getTextureID())

          Shader.full.setUniform1i("tex", 0)
          RenderRegistry.getRenderManager().render(obj.getID, obj.getRenderType())
        }
      }
    }
  }
}
Tejus Prasad
Russoul

1 Answer


Your current code performs up to 32768 (32*32*32) matrix multiplications, matrix inversions, uniform uploads, and texture state changes per frame, plus one draw call per block. These are generally expensive operations, and most of them are probably not needed for every single object in the chunk. Fortunately, there are a few ways to optimize.

Group by texture/material

Chances are that many of the objects share the same shader parameters and textures. Binding those values once and then drawing every object that shares them should win back some performance.
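As a rough sketch of the grouping idea (plain Scala, no GL calls): Block and StateKey below are hypothetical stand-ins for the block objects in the question, and only the texture ID and smoothness are used as the grouping key. A real key would include every uniform that differs between materials.

```scala
object BatchDemo {
  // Hypothetical, simplified block: only fields relevant to GL state plus a position.
  final case class Block(textureId: Int, smoothness: Float, x: Int, y: Int, z: Int)

  // The state that has to be bound before a block can be drawn.
  final case class StateKey(textureId: Int, smoothness: Float)

  // Partition the blocks so that each distinct state is bound once per frame.
  def groupByState(blocks: Seq[Block]): Map[StateKey, Seq[Block]] =
    blocks.groupBy(b => StateKey(b.textureId, b.smoothness))

  def main(args: Array[String]): Unit = {
    // A small 4x4 layer with two alternating textures.
    val blocks = for (x <- 0 until 4; y <- 0 until 4)
      yield Block((x + y) % 2, 0.5f, x, y, 0)

    for ((key, group) <- groupByState(blocks)) {
      // In the real renderer: bind key.textureId and set the material
      // uniforms here, once, then draw every block in `group`.
      println(s"texture ${key.textureId}: ${group.size} blocks")
    }
  }
}
```

For a mostly static chunk the grouping could be computed once when the chunk changes rather than every frame.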

Instanced rendering

I won't get into this too much since there's an excellent tutorial here on the subject. Assuming all your voxels use the same geometry and are mostly static, you can upload the transformation matrices in a buffer and upload the vertex data once, then draw every instance with a single call. The benefit will depend partly on your graphics driver, but for large voxel scenes it will probably be worth it.
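A minimal sketch of the CPU side of instancing, under a simplifying assumption that is mine and not part of the answer: every block is an axis-aligned cube that differs only by its position, so a 3-float offset per instance stands in for a full matrix. The `occupied` predicate is a hypothetical replacement for the `blocks(x)(y)(z) != null` check in the question.

```scala
object InstanceData {
  val Size = 32 // chunk edge length, as in the question

  // Flatten the positions of all occupied cells into x, y, z triples,
  // ready to be copied into a per-instance vertex buffer.
  def packOffsets(occupied: (Int, Int, Int) => Boolean): Array[Float] = {
    val out = Array.newBuilder[Float]
    for (x <- 0 until Size; y <- 0 until Size; z <- 0 until Size if occupied(x, y, z)) {
      out += x.toFloat
      out += y.toFloat
      out += z.toFloat
    }
    out.result()
  }

  def main(args: Array[String]): Unit = {
    // Example chunk: the lower half is solid.
    val offsets = packOffsets((x, y, z) => y < Size / 2)
    val instanceCount = offsets.length / 3
    println(s"$instanceCount instances, ${offsets.length * 4} bytes of instance data")
    // On the GL side (not shown): upload `offsets` once to a GL_ARRAY_BUFFER,
    // mark the attribute as per-instance with glVertexAttribDivisor(attrib, 1),
    // and draw the chunk with one glDrawElementsInstanced call using instanceCount.
  }
}
```

The buffer only needs to be rebuilt when a block in the chunk is added or removed, which is what makes this attractive for mostly static voxels.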

Move matrix operations to the GPU

This is minor, but if you don't feel like bothering with instancing, you can at least save some CPU cycles on the matrix operations. Instead of uploading M.inverse(), call inverse(M) in the shader. Also consider uploading the voxel position and transform as separate uniforms and computing transform*position on the GPU.

xlem
  • Actually, matrices that end up in a uniform should basically always be precomputed on the CPU. Why calculate something for each vertex (n times) on the GPU if you can calculate it once on the CPU? If n is large, the GPU won't be any faster; quite the contrary. GPUs are not magical, you know. See: http://stackoverflow.com/questions/16620013/should-i-calculate-matrices-on-the-gpu-or-on-the-cpu –  Sep 16 '15 at 14:05