
I'm working on a shader manager architecture and I have several questions for more experienced people. I'm currently choosing between two designs:


1. Per material shader program

=> Create one shader program per material used in the program.

Potential cons:

  • Considering every object might have its own material, it involves a lot of glUseProgram calls.
  • Implies the creation of a lot of shader program objects.
  • More complex architecture than #2.

Pros:

  • Shader code can be generated specifically for each "options" used in the material.
  • If I'm not wrong, uniforms have to be set only once (when the shader program is created).
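
A minimal sketch of what design #1 could look like, assuming shader variants are produced by prepending `#define` lines to a shared source and cached per option combination. The names `MaterialOptions`, `build_fragment_source`, `ProgramCache` and the `USE_*` macros are hypothetical, and `build_program` stands in for the real compile and `glLinkProgram` step, which needs a GL context:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MaterialOptions:
    """Hypothetical per-material feature switches (hashable, so usable as a dict key)."""
    normal_map: bool = False
    specular: bool = False
    parallax: bool = False


# Shared GLSL body; only the #define lines differ between variants.
FRAGMENT_BODY = """\
uniform vec4 baseColor;
void main() {
    vec4 color = baseColor;
#ifdef USE_SPECULAR
    color.rgb += vec3(0.1); // placeholder for a real specular term
#endif
    gl_FragColor = color;
}
"""


def build_fragment_source(options: MaterialOptions) -> str:
    """Prepend one #define per enabled option to the shared shader body."""
    defines = [
        f"#define USE_{f.name.upper()}"
        for f in fields(options)
        if getattr(options, f.name)
    ]
    return "\n".join(["#version 120", *defines, FRAGMENT_BODY])


class ProgramCache:
    """Builds each option combination once and reuses it afterwards."""

    def __init__(self, build_program):
        # build_program stands in for the compile + glLinkProgram step.
        self._build_program = build_program
        self._programs = {}

    def get(self, options: MaterialOptions):
        if options not in self._programs:
            source = build_fragment_source(options)
            self._programs[options] = self._build_program(source)
        return self._programs[options]
```

With a cache like this, two materials that enable the same options share one program, so the number of programs tracks the option combinations actually in use rather than the number of objects.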


2. Global shader programs

=> Create one shader program per shader functionality (lighting, reflection, parallax mapping...) and use configuration variables to enable or discard options depending on the material to render.

Potential cons:

  • Uniforms have to be changed many times per frame.

Pros:

  • Lower shader program count.
  • Fewer shader program switches (glUseProgram).
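
As a rough sketch of design #2, assuming a single "uber" fragment shader whose features are toggled through uniforms: the GLSL below and the `UniformState` helper are illustrative only, and `upload` stands in for the real `glUniform*` calls. Skipping values that have not changed is one common way to reduce the per-frame uniform traffic listed as a con:

```python
# One shader for every material; features are switched at run time.
UBER_FRAGMENT_SOURCE = """\
#version 120
uniform bool useSpecular;
uniform vec4 baseColor;
void main() {
    vec4 color = baseColor;
    if (useSpecular) {
        color.rgb += vec3(0.1); // placeholder for a real specular term
    }
    gl_FragColor = color;
}
"""

_UNSET = object()  # sentinel: "this uniform has never been uploaded"


class UniformState:
    """Remembers the last value sent per uniform and skips redundant uploads."""

    def __init__(self, upload):
        self._upload = upload  # stand-in for glUniform* calls
        self._current = {}
        self.uploads = 0

    def set(self, name, value):
        if self._current.get(name, _UNSET) == value:
            return  # value already on the GPU, nothing to do
        self._upload(name, value)
        self._current[name] = value
        self.uploads += 1


def draw_all(materials, state, draw):
    """Configure the uber shader for each material, then issue its draw."""
    for material in materials:
        for name, value in material.items():
            state.set(name, value)
        draw()
```

The trade-off is visible in the GLSL: the `if (useSpecular)` test runs for every fragment, which is the run-time cost that the per-material variants of design #1 avoid.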


You might notice that I'm currently leaning toward #1, but I wanted to know your opinion about it.

  • Does setting the uniforms once up front offset the glUseProgram call overhead (I'm not especially a speed freak)?
  • In case #1, for memory or performance reasons, should I call glLinkProgram only once when I create the shader program, or must I unlink/link each time I call glUseProgram?
  • Are there better solutions?

Thanks!

Profet
  • "If i'm not wrong, uniforms have to be set only one time (when the shaderprogram is created)." This would only be the case if you have as many materials as you do objects in the scene, is that the case? – ds-bos-msk Dec 26 '13 at 03:52

3 Answers


Let's look at #1:

"Considering every object might have its own material, it involves a lot of glUseProgram calls."

This isn't that big of a deal, really. Swapping programs is a heavy state change, but you'd be swapping textures too, so it's not like you're not already changing important state.
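
One common way to keep both kinds of state change down is to sort the draw list by program and then by texture before rendering. The sketch below is an illustration under that assumption, not something the answer prescribes; `DrawCall` and its integer handles are hypothetical, and the counter only models how many `glUseProgram` and texture bind calls a given order would need:

```python
from typing import NamedTuple


class DrawCall(NamedTuple):
    program: int  # shader program handle
    texture: int  # texture handle
    mesh: str     # whatever identifies the geometry


def count_state_changes(calls):
    """Return (program switches, texture binds) needed to draw in this order."""
    program_switches = texture_binds = 0
    current_program = current_texture = None
    for call in calls:
        if call.program != current_program:
            program_switches += 1
            current_program = call.program
        if call.texture != current_texture:
            texture_binds += 1
            current_texture = call.texture
    return program_switches, texture_binds


def sort_for_rendering(calls):
    """Group draws that share a program, then a texture."""
    return sorted(calls, key=lambda c: (c.program, c.texture))
```

Sorting like this only applies where draw order is free to change; transparent geometry, for example, usually has to be drawn back to front regardless of state.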

"Implies the creation of a lot of shader program objects."

This is going to hurt. Indeed, the main problem with #1 is the combinatorial explosion of shaders. While ARB_separate_shader_objects will help, it still means you have to write a lot of shaders, or come up with a way to not write a lot of shaders.

Or you can use deferred rendering, which helps to mitigate this. Among its many advantages is that it separates the generation of the material data from the computations that transform this material data into light reflectance (colors). Because of that, you have far fewer shaders to work with. You have a set of shaders that produces material data, and a set that uses the material data to do lighting computations.
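
To make the "far fewer shaders" point concrete, here is a back-of-the-envelope sketch. It assumes, as a simplification, that a forward renderer needs one shader per (material, light model) pair while a deferred renderer needs one G-buffer shader per material plus one lighting shader per light model. The material and light names are made up:

```python
def forward_shaders(materials, light_models):
    """Forward: every material is combined with every light model."""
    return [(m, l) for m in materials for l in light_models]


def deferred_shaders(materials, light_models):
    """Deferred: material passes and lighting passes are written separately."""
    gbuffer_passes = [("gbuffer", m) for m in materials]
    lighting_passes = [("lighting", l) for l in light_models]
    return gbuffer_passes + lighting_passes


materials = ["metal", "skin", "cloth"]
light_models = ["point", "spot", "directional", "ambient"]

forward_count = len(forward_shaders(materials, light_models))    # 3 * 4
deferred_count = len(deferred_shaders(materials, light_models))  # 3 + 4
```

The counts grow as a product in the first case and as a sum in the second; real engines have more axes of variation than these two, so actual numbers will differ.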

So I would say to use #1 with deferred rendering.

Nicol Bolas
  • Precisely, I planned to use deferred rendering. I read this in Blizzard's StarCraft II engine paper and I was not sure of its exact meaning: `Thus, writing shader code in our game is generally very close to adding a regular C++ file in our project. Figuratively, we treat the shader code as an external library called from C++, with the shader code being a free form body of code organized structurally as a C++ codebase would.` – Profet Jul 01 '11 at 01:29
  • `Thus, the concept of shaders can be a loose one in Starcraft II – the shader code library defines several entry points that translate to different shaders, but one entry point is free to call into any section of shader code. Thus it would be difficult to talk about how many “individual” shaders Starcraft II uses, as it is a single body of code from which more than a thousand shader permutations will generally be derived for a single video options configuration.` – Profet Jul 01 '11 at 01:31
  • @Profet: That has nothing really to do with deferred rendering. That's part of the machinery that Blizzard set up in their engine to handle shaders. – Nicol Bolas Jul 01 '11 at 01:36
  • Yes, I know, but this machinery looks like a kind of dynamic shader generator, doesn't it? :) – Profet Jul 01 '11 at 01:53
  • Can anyone give an example of how to keep things uniform across different shaders without duplicating code among them? The first thing that comes to my mind is keeping the MVP matrix the same across the shaders. Is there any example code for this? – rgngl Oct 28 '11 at 10:58
  • I think you must define your own uniforms for the modelview/projection matrices, since ftransform and such things are now deprecated. I'm not sure I get what you mean by "shaders uniformity", but I don't think it's possible to keep these values shared between multiple shaders. – Profet Nov 03 '11 at 15:23
  • @Profet: He's talking about copying code between different shader files. The layout and names of matrices, functions that are used in different shaders, etc. – Nicol Bolas Nov 03 '11 at 17:30

It really depends on your hardware and the specific demands of your app.

Another con of #2 is that your shader usually ends up not being as efficient because it has to do some conditional branching based on the uniforms you pass in. So you're basically trading off between less time switching state versus decreased throughput in your shader. It depends on which one is worse.

You should definitely only call glLinkProgram once per shader program. Compiling and linking a shader takes much longer than switching between already-linked programs.
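
A small sketch of that lifecycle, assuming a thin wrapper object: the program is linked once in the constructor and `use()` only binds it. `ShaderProgram` and `FakeGL` are hypothetical; `FakeGL` merely counts calls so the pattern can be shown without a GL context, where a real wrapper would call `glLinkProgram` and `glUseProgram`:

```python
class FakeGL:
    """Counting stand-in for the GL calls, so no GL context is needed."""

    def __init__(self):
        self.link_calls = 0
        self.use_calls = 0

    def link(self, sources):
        # A real implementation would compile the sources and call glLinkProgram.
        self.link_calls += 1
        return self.link_calls  # pretend program handle

    def use(self, handle):
        # A real implementation would call glUseProgram(handle).
        self.use_calls += 1


class ShaderProgram:
    """Links once at creation; binding it later never relinks."""

    def __init__(self, gl, sources):
        self._gl = gl
        self.handle = gl.link(sources)

    def use(self):
        self._gl.use(self.handle)
```

Creating one `ShaderProgram` and calling `use()` every frame therefore costs a single link for the lifetime of the object.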

There aren't really any better solutions. Pretty much everyone writing a rendering engine has to make the decision you're faced with.

Nathan Monteleone
  • To emphasize, branching in vertex/fragment shaders is bad. It will not get better with more complex GPUs (e.g., longer pipelines to flush on branch). – fscan Jan 18 '13 at 19:40
  • I don't think that's quite right -- branching in the shader itself doesn't require a pipeline flush. Changing shaders does. – Nathan Monteleone Jan 21 '13 at 18:48
  • As far as I understand, modern cards store state in state registers and send the name of the register with the commands through the pipeline to prevent flushing on shader/texture change. On the other hand, if a branch is predicted wrong, all the work done so far (like texture fetches, which have extremely high latency) has to be aborted and restarted. Nice read: http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/ – fscan Jan 22 '13 at 02:46
  • @fscan: "branching in vertex/fragment shaders is bad. It will not get better with more complex GPUs" That's not really true anymore. The PS4 GPU can handle branching really well. It can even be used to improve performance! – Tara Oct 23 '13 at 08:52

On mobile devices, branching in a shader greatly increases render time. You should measure the time to switch programs against the reduced draw rate caused by branching in an operation that is repeated per vertex or per texel. I'd recommend method #1, and taking a look at how GPUImage is set up for a good OOP-friendly shader architecture.

https://github.com/BradLarson/GPUImage

Slynk
  • Dynamic branching can, even on desktop, still be very costly because, depending on the GPU architecture, all branches may need to be evaluated if any of them is taken within a 2x2 (IIRC) grid. I don't really remember the details, but branching should really be avoided at all cost (pun intended). – RecursiveExceptionException Aug 27 '16 at 19:18