I am trying to implement auto exposure for HDR tone mapping, and I want to reduce the cost of finding the average brightness of my scene. I seem to have hit a choke point with glReadPixels. Here is my setup:
1: I create a downsampled FBO to reduce the cost of the glReadPixels read, using only the GL_RED channel in GL_BYTE format.
    private void CreateDownSampleExposure() {
        DownFrameBuffer = glGenFramebuffers();
        DownTexture = GL11.glGenTextures();
        glBindFramebuffer(GL_FRAMEBUFFER, DownFrameBuffer);
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, DownTexture);
        // Single-channel texture, 1600x1200 divided by 8 in each dimension; no initial data.
        GL11.glTexImage2D(GL11.GL_TEXTURE_2D, 0, GL11.GL_RED, 1600/8, 1200/8,
                0, GL11.GL_RED, GL11.GL_BYTE, (ByteBuffer) null);
        // Attach the texture as the FBO's only color target.
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                GL11.GL_TEXTURE_2D, DownTexture, 0);
        if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
            System.err.println("error");
        } else {
            System.err.println("success");
        }
        // Restore the default texture and framebuffer bindings.
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, 0);
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
    }
2: I set up the ByteBuffer and read back the FBO texture created above.
    Setup(){
        // One byte per pixel of the downsampled image.
        byte[] testByte = new byte[1600/8*1000/8];
        ByteBuffer testByteBuffer = BufferUtils.createByteBuffer(testByte.length);
        testByteBuffer.put(testByte);
        testByteBuffer.flip();
    }
    MainLoop(){
        // Render scene and store result into downSampledFBO texture
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, DeferredFBO.getDownTexture());
        // GL11.glGetTexImage(GL11.GL_TEXTURE_2D, 0, GL11.GL_RED, GL11.GL_BYTE,
        //         testByteBuffer); <- This is slower than readPixels.
        GL11.glReadPixels(0, 0, DisplayManager.Width/8, DisplayManager.Height/8,
                GL11.GL_RED, GL11.GL_BYTE, testByteBuffer);
        // Sum every byte that was read back.
        int x = 0;
        for (int i = 0; i < testByteBuffer.capacity(); i++) {
            x += testByteBuffer.get(i);
        }
        System.out.println(x); // Print out accumulated value of brightness.
        // Adjust exposure depending on brightness.
    }
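For reference, the summing loop above can be isolated from the GL calls. The following is only a minimal sketch, not the code from my project: the class name `BrightnessAverage` and the method `averageBrightness` are hypothetical, and it assumes the buffer holds GL_BYTE values where 127 corresponds to full brightness.

```java
import java.nio.ByteBuffer;

public class BrightnessAverage {
    // Assumption: pixels were read with GL_BYTE, so full brightness maps to 127.
    static final double MAX_SIGNED_BYTE = 127.0;

    // Returns the mean of all bytes up to the buffer's limit, scaled to roughly [0, 1].
    static double averageBrightness(ByteBuffer pixels) {
        int count = pixels.limit();
        if (count == 0) {
            return 0.0;
        }
        long sum = 0; // long avoids overflow for large buffers
        for (int i = 0; i < count; i++) {
            sum += pixels.get(i); // absolute get: does not move the position
        }
        return (sum / (double) count) / MAX_SIGNED_BYTE;
    }

    public static void main(String[] args) {
        // Stand-in data instead of a real glReadPixels result.
        ByteBuffer pixels = ByteBuffer.allocate(4);
        pixels.put(new byte[] {0, 127, 127, 0});
        pixels.flip();
        System.out.println(averageBrightness(pixels));
    }
}
```

Dividing by the pixel count gives an average rather than a raw sum, so the result does not change scale when the downsample factor changes.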
The problem is that even when I downsample my FBO texture by a factor of 100, so that glReadPixels reads only 16x10 pixels, there is little to no further performance gain. There is a substantial gain compared with no downsampling, but once I go past dividing the width and height by about 8, the improvement falls off. It seems like there is a large fixed overhead in just calling this function. Is there something I am doing incorrectly or not considering when calling glReadPixels?
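To put numbers on the sizes involved, here is a small sketch of the arithmetic. It assumes a 1600x1000 source (the size implied by the 16x10 figure) and one byte per pixel; the class `ReadbackSize` and the method `pixelCount` are hypothetical names used only for illustration.

```java
public class ReadbackSize {
    // Number of pixels read back after dividing each dimension by the factor.
    static int pixelCount(int width, int height, int factor) {
        return (width / factor) * (height / factor);
    }

    public static void main(String[] args) {
        int width = 1600;  // assumed source width
        int height = 1000; // assumed source height
        for (int factor : new int[] {1, 8, 100}) {
            // With GL_RED and GL_BYTE, each pixel is a single byte.
            System.out.println("factor " + factor + ": "
                    + pixelCount(width, height, factor) + " bytes");
        }
    }
}
```

Under these assumptions, going from a factor of 8 to a factor of 100 shrinks the readback from 25000 bytes to 160 bytes, which is why I expected the cost to keep dropping.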