
I have been trying to implement a render-to-texture approach in our GLES 3 application. I have it working, but I am a little disappointed with the frame-rate drop.

So far we have been rendering directly to the main FBO, which is a multisampled framebuffer created with EGL_SAMPLES=8.

What I basically want is to get hold of pixels that have already been drawn, while I'm still drawing. So I thought a render-to-texture approach should do it: I'd read a section of the off-screen FBO's texture whenever I want, and when I'm done rendering to it I'd blit the whole thing to the main FBO.

Digging into this, I found I had to implement a system with a multisampled FBO plus a non-multisampled textured FBO into which the multisampled one is resolved, and then blit the resolved FBO to the main FBO.
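For reference, a minimal sketch of that two-FBO setup in GLES 3.0 looks roughly like this (`WIDTH`, `HEIGHT`, and the 8x sample count are placeholder values; error checking omitted):

```c
/* Sketch of the MSAA FBO + resolve FBO system described above. */
GLuint msaaFbo, msaaColor, msaaDepth;
GLuint resolveFbo, resolveTex;

/* 1. Multisampled off-screen FBO (renderbuffers only in ES 3.0). */
glGenFramebuffers(1, &msaaFbo);
glBindFramebuffer(GL_FRAMEBUFFER, msaaFbo);

glGenRenderbuffers(1, &msaaColor);
glBindRenderbuffer(GL_RENDERBUFFER, msaaColor);
glRenderbufferStorageMultisample(GL_RENDERBUFFER, 8, GL_RGBA8, WIDTH, HEIGHT);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                          GL_RENDERBUFFER, msaaColor);

glGenRenderbuffers(1, &msaaDepth);
glBindRenderbuffer(GL_RENDERBUFFER, msaaDepth);
glRenderbufferStorageMultisample(GL_RENDERBUFFER, 8, GL_DEPTH_COMPONENT24,
                                 WIDTH, HEIGHT);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                          GL_RENDERBUFFER, msaaDepth);

/* 2. Single-sampled FBO with a texture attachment to resolve into. */
glGenFramebuffers(1, &resolveFbo);
glBindFramebuffer(GL_FRAMEBUFFER, resolveFbo);

glGenTextures(1, &resolveTex);
glBindTexture(GL_TEXTURE_2D, resolveTex);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, WIDTH, HEIGHT);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, resolveTex, 0);

/* Per frame: draw into msaaFbo, then resolve, then blit to the main FBO. */
glBindFramebuffer(GL_READ_FRAMEBUFFER, msaaFbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFbo);
glBlitFramebuffer(0, 0, WIDTH, HEIGHT, 0, 0, WIDTH, HEIGHT,
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);   /* MSAA resolve */

glBindFramebuffer(GL_READ_FRAMEBUFFER, resolveFbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);            /* main FBO */
glBlitFramebuffer(0, 0, WIDTH, HEIGHT, 0, 0, WIDTH, HEIGHT,
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);
```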

This all works, but with the above system and a non-multisampled main FBO (EGL_SAMPLES=0) I get a big frame-rate drop compared to the frame rate I get when I render into just the main FBO with EGL_SAMPLES=8.

Digging a bit more into this, I found reports online, as well as a post at https://community.arm.com/thread/6925, saying that the fastest approach to multisampling is to use EGL_SAMPLES. And indeed that's what it looks like on the Jetson TK1, which is our target board.

Which finally leads me to the question, and apologies for the long introduction:

Is there any way that I can design this to use a non-multisampled off-screen FBO for all the rendering that eventually is blitted to a main multisampled FBO that uses EGL_SAMPLES?

genpfault
Boofish
    That last paragraph is completely backwards. You do not start by rasterizing 1 sample per-pixel and then blit that into a multisampled framebuffer. You want to go the other direction and resolve a multisampled buffer into a single sample, otherwise you're just wasting VRAM. – Andon M. Coleman Dec 04 '16 at 00:11

2 Answers


The only point of MSAA is to anti-alias geometry edges. It only provides benefit if multiple triangle edges appear in the same pixel. For rendering pipelines which are doing multiple off-screen passes you want to enable multiple samples for the off-screen passes which contain your geometry (normally one of the early passes in the pipeline, before any post-processing effects).

Applying MSAA at the end of the pipeline, on the final blit, will provide zero benefit, and probably isn't free (it will be close to free on tile-based renderers like IMG Series 6 and Mali (the blog you linked), less so on immediate-mode renderers like the NVIDIA GPU in your Jetson board).

Note that for off-screen anti-aliasing the "standard" approach is rendering to an MSAA framebuffer, and then resolving as a second pass (e.g. using glBlitFramebuffer to blit into a single-sampled buffer). This bounce is inefficient on many architectures, so the EXT_multisampled_render_to_texture extension exists to help:

Effectively this provides the same implicit resolve as the EGL window surface functionality.
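A minimal sketch of how the extension is used, assuming it is reported in the extension string (`WIDTH`, `HEIGHT`, and the 4x sample count are placeholders):

```c
/* Implicit-resolve rendering with EXT_multisampled_render_to_texture. */
GLuint fbo, tex;

/* Ordinary single-sampled 2D texture. */
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, WIDTH, HEIGHT, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);

/* Attach the single-sampled texture, but rasterize with 4 samples;
 * the resolve into the texture happens implicitly. A multisampled
 * depth attachment, if needed, would come from
 * glRenderbufferStorageMultisampleEXT. */
glFramebufferTexture2DMultisampleEXT(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                                     GL_TEXTURE_2D, tex, 0, 4);

/* Draw the scene into fbo, then sample tex as a normal 2D texture. */
```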

Answers to your questions in the comments.

Is the resulting texture a multisampled texture in that case?

From the application point of view, no. The multisampled data is inside an implicitly allocated buffer, allocated by the driver. See this bit of the spec:

"The implementation allocates an implicit multisample buffer with TEXTURE_SAMPLES_EXT samples and the same internalformat, width, and height as the specified texture level."

This may require a real MSAA buffer allocation in main memory on some GPU architectures (and so be no faster than the manual glBlitFramebuffer approach without the extension), but is known to be effectively free on others (i.e. tile-based GPUs where the implicit "buffer" is a small RAM inside the GPU, and not in main memory at all).

The goal is to blur the background behind widgets

MSAA is not in any way a general-purpose blur - it only anti-aliases the pixels which are coincident with edges of triangles. If you want to blur triangle faces you'd be better off using a separable Gaussian blur implemented as a pair of fragment shaders, run as a 2D post-processing pass.
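As an illustration, a separable 5-tap Gaussian pass might look like the sketch below (the uniform names are hypothetical). The same shader source serves both passes: set `uTexelStep` to `(1/width, 0)` for the horizontal pass and `(0, 1/height)` for the vertical pass, rendering a full-screen quad each time.

```c
/* One axis of a separable Gaussian blur as an ES 3.0 fragment shader.
 * The weights are a normalized Gaussian: 0.4026 + 2*0.2442 + 2*0.0545 = 1.0 */
static const char *kBlurFrag =
    "#version 300 es\n"
    "precision mediump float;\n"
    "uniform sampler2D uScene;\n"
    "uniform vec2 uTexelStep;\n"   /* axis of this pass, in UV units */
    "in vec2 vUv;\n"
    "out vec4 fragColor;\n"
    "const float w[3] = float[](0.4026, 0.2442, 0.0545);\n"
    "void main() {\n"
    "    vec4 c = texture(uScene, vUv) * w[0];\n"
    "    for (int i = 1; i < 3; ++i) {\n"
    "        c += texture(uScene, vUv + float(i) * uTexelStep) * w[i];\n"
    "        c += texture(uScene, vUv - float(i) * uTexelStep) * w[i];\n"
    "    }\n"
    "    fragColor = c;\n"
    "}\n";
```

Running the two passes costs O(2n) texture taps per pixel instead of O(n²) for a naive 2D kernel, which is why the separable form is preferred as a post-process.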

solidpixel
  • Yeah, the solution I tried was what you describe here: render to an off-screen MSAA FBO (with a multisampled renderbuffer, not a texture, as we're on GLES 3 or lower), then resolve to a single-sample FBO with a texture, then use glBlitFramebuffer to blit it to the main FBO. It's slow, especially at 1920x1080 resolution. But this extension you talk about sounds interesting. Is the resulting texture a multisampled texture in that case? The best solution for us would be to somehow have a 2D texture of the scene that we can access at any time. The goal is to blur the background behind widgets. – Boofish Dec 05 '16 at 15:30
  • Thanks for your answer. What I actually meant by blur, though, was that I would like to copy a section of whatever has already been drawn and redraw it blurred, while I'm still in my render loop. That's why I'm trying to render to a texture. The problem is that I use MSAA for anti-aliasing, and that complicates the "render to texture" step. I mentioned the blur to state the goal of this ordeal, in case you knew of any easier ways to achieve it. I already have a one-pass Gaussian blur shader that works well; I just need to be able to sample what's already in my MSAA FBO. – Boofish Dec 05 '16 at 23:40

Is there any way that I can design this to use a non-multisampled off-screen FBO for all the rendering that eventually is blitted to a main multisampled FBO that uses EGL_SAMPLES?

Not in any way which is genuinely useful.

Framebuffer blitting does allow blits from single-sampled buffers to multisample buffers. But all that does is give every sample within a pixel the same value as the one from the source.

Blitting cannot generate new information, so you won't get any actual anti-aliasing. All you will get is the same data stored in a much less efficient way.

Nicol Bolas