
I'm trying to learn how to program GLSL geometry shaders. My test project works like this: I have N VBOs which are modeling "blades of grass". Without the shader, each blade of grass is basically a line strip with 20 segments. I was able to animate this more or less smoothly with almost N=10k blades, so that's 200,000 line segments.

The shader takes each line segment and blows it out to a cylinder of the same length centered on that segment, so the blades of grass are now tubes with real dimensionality. Nothing has changed on the CPU side; I'm just trying to leverage the GPU to add more geometry so I can shade the blades. Each cylinder has 30 sections, so that's 60 triangles per segment, or 1200 triangles per blade.

The thing is, to get it to animate smoothly I had to scale back to only 25 blades. That's only 30k triangles, which is basically LESS geometry than I was dealing with before when I wasn't using shaders at all.

This is running on a MacBook Pro, Snow Leopard, AMD Radeon HD 6750M. No idea if that's a good card or not.

The shader code is pretty simple -- the vertex shader just has gl_Position = gl_Vertex. Lighting is happening in the geometry shader: simple ambient, specular and diffuse components, basically straight out of the tutorials. Fragment shader is similarly simplistic, just multiplies the grass color by the light intensity that was passed over from the geometry shader.
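To give a sense of what I mean, here's roughly what the vertex and fragment stages look like (simplified sketch; identifier names like `lightIntensity` and `grassColor` are illustrative, not my exact code):

```glsl
// Vertex shader (GLSL 1.20): pure pass-through, all real work is in the GS
#version 120
void main() {
    gl_Position = gl_Vertex;
}
```

```glsl
// Fragment shader (GLSL 1.20): just modulate the grass color by the
// light intensity computed per-vertex in the geometry shader
#version 120
varying float lightIntensity;  // written by the geometry shader
uniform vec3 grassColor;
void main() {
    gl_FragColor = vec4(grassColor * lightIntensity, 1.0);
}
```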

This is an old version of OpenGL by the way, 2.1 -- so it's GLSL 1.2, and the geometry shader needs the GL_EXT_geometry_shader4 extension. In case that's relevant.

Also, the stack is Processing on top of GLGraphics on top of JOGL on top of Java. I'd be surprised if that was a factor, unless somehow it's emulating the shader code on the CPU but I didn't think OpenGL did that kind of thing automatically for you.

Anyway, do these numbers seem reasonable, or am I doing something wrong? Am I unrealistically expecting geo shaders to work miracles?

genpfault
eeeeaaii
  • I'd try this code on a card that is sure to run the GS on hardware as opposed to the driver emulating that functionality. Perhaps some new Fermi-class hardware? I have a strong feeling it will work great :) – Ani Mar 12 '12 at 15:59
  • @ananthonline: Are you saying that AMD's HD-class hardware *doesn't* have hardware geometry shaders? Because that's not true. – Nicol Bolas Mar 12 '12 at 17:21
  • Not really sure, hence the "try" bit. Also, it might be bad drivers or something else. The only way to make sure is to try it on some other card+driver combination that works well for someone else. Correct? – Ani Mar 12 '12 at 20:26

1 Answer


No one has ever accused Geometry Shaders of being fast. Especially when increasing the size of geometry.

Your GS is taking a line and not only doing a roughly 30x amplification of vertex data, but also doing lighting computations on each of those new vertices. That's not going to be terribly fast, in large part due to a lack of parallelism. Each GS invocation has to do 60 lighting computations serially, rather than having 60 separate vertex shader invocations doing those 60 lighting computations in parallel.

You're basically creating a giant bottleneck in your geometry shader.

It would probably be faster to put the lighting stuff in the fragment shader (yes, really).
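That is, have the GS emit only positions and normals, and defer the lighting math to the fragment stage. A hedged sketch of what that fragment shader might look like (GLSL 1.20; the varying and uniform names here are illustrative assumptions, not the asker's code):

```glsl
// Fragment shader: per-fragment diffuse lighting.
// The geometry shader is assumed to write vNormal and vPosition
// (eye-space) via "varying out" instead of computing lighting itself.
#version 120
varying vec3 vNormal;      // per-vertex normal emitted by the GS
varying vec3 vPosition;    // eye-space position emitted by the GS
uniform vec3 lightPosEye;  // light position in eye space (illustrative)
uniform vec3 grassColor;

void main() {
    vec3 N = normalize(vNormal);
    vec3 L = normalize(lightPosEye - vPosition);
    float diff = max(dot(N, L), 0.0);
    // simple ambient + diffuse; specular would follow the same pattern
    gl_FragColor = vec4(grassColor * (0.2 + 0.8 * diff), 1.0);
}
```

This keeps the GS cheap (pure geometry amplification) and lets the heavily parallel fragment hardware do the per-sample shading.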

Nicol Bolas
  • That's interesting... ever since I learned about geometry shaders I've been wondering why they don't come *before* the vertex shader. That would seem much more logical -- tessellate first, then do lighting in the vertex shader, then move on to the fragment shader. I'm just not sure I really "get" geometry shaders -- what they are really used for/good for. – eeeeaaii Mar 13 '12 at 03:52
  • 2
    @eeeeaaii: Geometry shaders aren't for tessellating; they can do it, but there's a reason why DX11/GL4.x added shaders specifically for tessellation. Geometry shaders are for doing things at the per-primitive level. Their main use is for layered rendering, where you can render different objects to different layers of a [layered render target](http://www.opengl.org/wiki/Framebuffer_Object#Layered_Images). You can also use them for point-sprites that work correctly (clipping to the size of the sprite, not to the center of the sprite). – Nicol Bolas Mar 13 '12 at 04:32