
I'm looking for a way to improve my particle system performance, since it is very costly in terms of FPS. This is because I call:

glDrawElements(GL_TRIANGLE_STRIP, mNumberOfIndices,
          GL_UNSIGNED_SHORT, 0);

I call this method for every particle in my application (which could be between 1000 and 5000 particles). Once the count goes above 1000 particles, my application's FPS starts to drop. I'm using VBOs, but calling this method that many times is still too costly.

Any ideas how I can make the particle system more efficient?

Edit: This is how my particle system draws things:

glBindTexture(GL_TEXTURE_2D, textureObject);
glBindBuffer(GL_ARRAY_BUFFER, vboVertexBuffer[0]);
glVertexPointer(3, GL_FLOAT, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, vboTextureBuffer[0]);
glTexCoordPointer(2, GL_FLOAT, 0, 0); 
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vboIndexBuffer[0]);

Vector3f partPos;

for (int i = 0; i < m_numParticles; i++) {
    partPos = m_particleList[i].m_pos;          
    glTranslatef(partPos.x, partPos.y, partPos.z);
    glDrawElements(GL_TRIANGLE_STRIP, mNumberOfIndices, 
        GL_UNSIGNED_SHORT, 0);
    glTranslatef(-partPos.x, -partPos.y, -partPos.z);
}
genpfault
Curtain
  • What data types are you using for coordinates etc.? If you're using anything other than GL_FLOAT, you can see a (huge) performance drop. – Rookie Jul 25 '11 at 10:02
  • @Rookie: I have float values. – Curtain Jul 25 '11 at 11:35
  • I know the post is a bit old but for anyone wondering, here is an amazing tutorial I found. (I'm not the author of this tutorial). http://www.opengl-tutorial.org/intermediate-tutorials/billboards-particles/particles-instancing/ – Thibault Reuille Apr 15 '14 at 00:13

2 Answers


The way you describe it, it sounds like you have a separate VBO for each particle. This is not how it should be done. Put all particles into a single VBO and draw them all at once using a single glDrawElements or glDrawArrays call. Or even better, if available: use instancing.
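As a rough sketch of what "all particles in a single VBO" could look like on the CPU side, the function below bakes each particle's position into its vertices, which replaces the per-particle glTranslatef. It assumes every particle is a flat quad of four vertices in the XY plane; `Vec3`, `buildParticleVertices`, and `halfSize` are hypothetical names for illustration, not taken from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Vector3f.
struct Vec3 { float x, y, z; };

// Returns a flat x,y,z array with four vertices per particle, already
// translated to the particle position. Corner order is bottom-left,
// bottom-right, top-left, top-right (the order a triangle strip would use).
std::vector<float> buildParticleVertices(const std::vector<Vec3>& positions,
                                         float halfSize) {
    static const float corners[4][2] = {
        {-1.0f, -1.0f}, {1.0f, -1.0f}, {-1.0f, 1.0f}, {1.0f, 1.0f}};

    std::vector<float> vertices;
    vertices.reserve(positions.size() * 4 * 3);
    for (const Vec3& p : positions) {
        for (const auto& c : corners) {
            vertices.push_back(p.x + c[0] * halfSize);
            vertices.push_back(p.y + c[1] * halfSize);
            vertices.push_back(p.z);  // assumption: quad lies in the XY plane
        }
    }
    return vertices;
}
```

The resulting array would be uploaded to the vertex VBO once per frame and drawn with one glDrawElements or glDrawArrays call, so the translate/draw/translate loop disappears.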

datenwolf
  • Thanks for the answer, but I reuse the same VBO for every particle. And then I just loop through every particle using `glDrawElements()`. The problem is that there are too many calls to `glDrawElements()`, and I'm not clear on how this should be solved. – Curtain Jul 25 '11 at 11:35
  • The answer is clear: draw the whole particle system with a single draw call. – kvark Jul 25 '11 at 12:10
  • @kvark: That wasn't a big wake up for me. The question is which is the best way to go since every particle needs their own position? Modifying the VBO during draw time? – Curtain Jul 25 '11 at 13:36
  • Put them all at once into a single VBO. Make it a single (large) mesh of unconnected faces. Particle 1 and indices 0,1,2,3, particle 2 at indices 4,5,6,7 and so on. – datenwolf Jul 25 '11 at 13:41
  • @datenwolf: Thanks, will try this out when I'm at home. :) – Curtain Jul 25 '11 at 13:44
  • Julian Assange. The point is: particle position data *nevertheless* goes from your system memory into VRAM for drawing. In your current implementation, it travels in the form of internal uniforms, set via glTranslate. If you make particle position a part of the VBO, changing it per frame, it will still be the same amount of data travelling the bus. But the drawing operation will be way faster. – kvark Jul 25 '11 at 18:22

Expanding a bit on what datenwolf said, just pack all your particle indices into a single index buffer and draw all particles with a single glDrawElements call. This means you cannot use triangle strips anymore and need a plain triangle list (GL_TRIANGLES) instead, but that shouldn't be too much of a problem.
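A minimal sketch of such a combined index buffer, assuming each particle is a quad of four consecutive vertices that used to be drawn as the strip 0,1,2,3. `buildQuadIndices` is a hypothetical helper name; the 16-bit index type matches the GL_UNSIGNED_SHORT used in the question, which limits one buffer to 65536 vertices (16384 quads).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Builds a triangle-list index buffer for `particleCount` quads.
// Each strip 0,1,2,3 becomes the triangles (0,1,2) and (2,1,3), which keeps
// the winding order the strip had.
std::vector<std::uint16_t> buildQuadIndices(std::size_t particleCount) {
    if (particleCount * 4 > 65536) {
        throw std::length_error("too many vertices for 16-bit indices");
    }
    static const int pattern[6] = {0, 1, 2, 2, 1, 3};

    std::vector<std::uint16_t> indices;
    indices.reserve(particleCount * 6);
    for (std::size_t i = 0; i < particleCount; ++i) {
        const std::size_t base = i * 4;  // first vertex of this particle
        for (int offset : pattern) {
            indices.push_back(static_cast<std::uint16_t>(base + offset));
        }
    }
    return indices;
}
```

Since the indices only depend on the particle count, this buffer can be built and uploaded once, then drawn with something like `glDrawElements(GL_TRIANGLES, 6 * particleCount, GL_UNSIGNED_SHORT, 0)`.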

Otherwise, if your hardware supports instanced rendering (or better, instanced arrays), you can do it by just rendering a single particle n times with the position and texCoord data taken from the respective arrays for each particle. You then still need to compute the four corners' position and texCoord data in the vertex shader (assuming you draw a quad for each particle), since with instanced arrays you only get one attribute value per instance (particle).
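The per-corner computation mentioned here is simple arithmetic. The sketch below writes it in C++ purely to show the math; in a real program it would live in the GLSL vertex shader, with the vertex index coming from something like gl_VertexID and the center from a per-instance attribute. `expandCorner`, `Corner`, and `halfSize` are hypothetical names, and the quad is assumed to lie in the XY plane rather than being billboarded towards the camera.

```cpp
// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };
struct Corner {
    Vec3 position;  // corner position of the quad
    float u, v;     // texture coordinate of that corner
};

// Derives one quad corner from the per-instance center.
// vertexId 0..3 maps to bottom-left, bottom-right, top-left, top-right:
// bit 0 selects the side in x, bit 1 selects the side in y.
Corner expandCorner(const Vec3& center, float halfSize, int vertexId) {
    const float sx = (vertexId & 1) ? 1.0f : -1.0f;
    const float sy = (vertexId & 2) ? 1.0f : -1.0f;
    return Corner{
        Vec3{center.x + sx * halfSize, center.y + sy * halfSize, center.z},
        (sx + 1.0f) * 0.5f,   // u is 0 on the left edge, 1 on the right
        (sy + 1.0f) * 0.5f};  // v is 0 on the bottom edge, 1 on the top
}
```

With this scheme the only data that changes per frame is one position per particle, instead of four vertices per particle.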

You might also use the geometry shader to create the particle's quad and just render a single point set, but I assume this might be slower than instancing, considering that SM4/GL3 hardware is quite likely to support instancing, too.

Christian Rau
  • Thanks for answering. Okay, then I need to put all particles into a single VBO that holds the correct x, y and z coordinates (so I can skip translating them), because they need to be rendered in different positions. However, is it efficient to upload a new VBO to OpenGL at draw time, since the positions always change? – Curtain Jul 25 '11 at 13:33
  • @Julian Just make the position buffer `GL_DYNAMIC_DRAW` (or even `GL_STREAM_DRAW`?) and update it every frame (using `glMapBuffer` or `glBuffer(Sub)Data`). It should at least be faster than your current solution. – Christian Rau Jul 25 '11 at 13:36
  • Thanks, will try this out when I'm at my working computer. :) – Curtain Jul 25 '11 at 13:37