
I have implemented a 2D Particle System based on the ideas and concepts outlined in "Building an Advanced Particle System" (John van der Burg, Game Developer Magazine, March 2000).

Now I am wondering what performance I should expect from this system. I am currently testing it within the context of my simple (unfinished) SDL/OpenGL platformer, where all particles are updated every frame. Drawing is done as follows:

// Bind Texture
glBindTexture(GL_TEXTURE_2D, *texture);
// for all particles
    glBegin(GL_QUADS);
    glTexCoord2d(0,0);  glVertex2f(x,y);
    glTexCoord2d(1,0);  glVertex2f(x+w,y);
    glTexCoord2d(1,1);  glVertex2f(x+w,y+h);
    glTexCoord2d(0,1);  glVertex2f(x,y+h);
    glEnd();   

where one texture is used for all particles.

It runs smoothly up to about 3000 particles. To be honest I was expecting a lot more, particularly since this is meant to be used with more than one system on screen. What number of particles should I expect to be displayed smoothly?

PS: I am relatively new to C++ and OpenGL likewise, so it might well be that I messed up somewhere!?

EDIT Using POINT_SPRITE

glEnable(GL_POINT_SPRITE);
glBindTexture(GL_TEXTURE_2D, *texture);
glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE); 

// for all particles
    glBegin(GL_POINTS);
    glPointSize(size);
    glVertex2f(x,y);
    glEnd();

glDisable( GL_POINT_SPRITE );

Can't see any performance difference to using GL_QUADS at all!?

EDIT Using VERTEX_ARRAY

// Setup
glEnable (GL_POINT_SPRITE);                                         
glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE);              
glPointSize(20);                                    

// A big array to hold all the points
const int NumPoints = 2000;
Vector2 ArrayOfPoints[NumPoints];
for (int i = 0; i < NumPoints; i++) {
    ArrayOfPoints[i].x = 350 + rand()%201;
    ArrayOfPoints[i].y = 350 + rand()%201;
}

// Rendering
glEnableClientState(GL_VERTEX_ARRAY);     // Enable vertex arrays
glVertexPointer(2, GL_FLOAT, 0, ArrayOfPoints);     // Specify data
glDrawArrays(GL_POINTS, 0, NumPoints);  // Draw as points, starting at index 0 of the array and drawing exactly NumPoints vertices

Using VAs made a performance difference compared to the above. I then tried VBOs, but I don't really see a performance difference there.

Ben

2 Answers


I can't say how much you can expect from that solution, but there are some ways to improve it.

Firstly, by using glBegin() and glEnd() you are using immediate mode, which is, as far as I know, the slowest way to submit geometry. Furthermore, it was deprecated in OpenGL 3.0 and is no longer part of the core profile.

For OpenGL 2.1

Point Sprites:

You might want to use point sprites. I implemented a particle system using them and got good performance (for my knowledge back then, at least). With point sprites you make fewer OpenGL calls per frame and send less data to the graphics card (or may even have the data stored on the graphics card, I'm not sure about that). A short Google search should turn up some implementations to look at.

Vertex Arrays:

If using point sprites doesn't help, you should consider using vertex arrays in combination with point sprites (to save a bit of memory). Basically, you store the vertex data of all particles in one array. You then enable vertex array support by calling glEnableClientState() with GL_VERTEX_ARRAY as the parameter. After that, you call glVertexPointer() (the parameters are explained in the OpenGL documentation) and then glDrawArrays() to draw the particles. This reduces your OpenGL calls to only a handful per frame instead of several calls for each of the 3000 particles.
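If you want to keep textured quads rather than points, the same idea applies: fill one array on the CPU each frame and submit it with a single draw call. The following is only a rough sketch of the CPU side. The `Particle` and `QuadVertex` structs and the `buildQuadVertices` function are names I made up for illustration, since the question doesn't show the real particle layout. The OpenGL calls are left in a comment so the snippet compiles without GL headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record; the real system's layout is not shown in the question.
struct Particle { float x, y, w, h; };

// One corner of a textured quad: position (x, y) plus texture coordinate (u, v).
struct QuadVertex { float x, y, u, v; };

// Rebuild the vertex array for this frame: four corners per particle,
// in the same corner order as the immediate-mode code in the question.
void buildQuadVertices(const std::vector<Particle>& particles,
                       std::vector<QuadVertex>& out) {
    out.clear();
    out.reserve(particles.size() * 4);
    for (const Particle& p : particles) {
        out.push_back({p.x,       p.y,       0.0f, 0.0f});
        out.push_back({p.x + p.w, p.y,       1.0f, 0.0f});
        out.push_back({p.x + p.w, p.y + p.h, 1.0f, 1.0f});
        out.push_back({p.x,       p.y + p.h, 0.0f, 1.0f});
    }
    assert(out.size() == particles.size() * 4);
}

// Assumed usage with OpenGL 2.1 client-side arrays (needs a GL context, not compiled here):
//   glEnableClientState(GL_VERTEX_ARRAY);
//   glEnableClientState(GL_TEXTURE_COORD_ARRAY);
//   glVertexPointer(2, GL_FLOAT, sizeof(QuadVertex), &verts[0].x);
//   glTexCoordPointer(2, GL_FLOAT, sizeof(QuadVertex), &verts[0].u);
//   glDrawArrays(GL_QUADS, 0, (GLsizei)verts.size());
```

Reusing the same `std::vector` every frame avoids reallocating once it has grown to the maximum particle count.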

For OpenGL 3.3 and above

Instancing:

If you are programming against OpenGL 3.3 or above, you can even consider using instancing to draw your particles, which should speed that up even further. Again, a short google search will let you look at some code about that.

In General:

Using SSE:

In addition, some time might be lost while updating your vertex positions. If you want to speed that up, you can look into using SSE for the update. If done correctly, you will gain a lot of performance (at least with a large number of particles).
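To give an idea of what that could look like, here is a minimal sketch that advances four floats per iteration with SSE intrinsics. It assumes an x86 target with SSE available, and that positions and velocities live in separate, contiguous float arrays. The function name `integratePositions` and its parameters are my own invention, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// pos[i] += vel[i] * dt for all i, four floats at a time.
// Unaligned loads/stores are used so the arrays need no special alignment.
void integratePositions(float* pos, const float* vel, std::size_t count, float dt) {
    assert(count == 0 || (pos != nullptr && vel != nullptr));
    const __m128 vdt = _mm_set1_ps(dt);  // broadcast dt into all four lanes
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        __m128 p = _mm_loadu_ps(pos + i);
        __m128 v = _mm_loadu_ps(vel + i);
        p = _mm_add_ps(p, _mm_mul_ps(v, vdt));
        _mm_storeu_ps(pos + i, p);
    }
    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < count; ++i) {
        pos[i] += vel[i] * dt;
    }
}
```

Whether this beats what the compiler already auto-vectorizes depends on your compiler and flags, so measure before and after.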

Data Layout:

Finally, I recently found a link (divergentcoder.com/programming/aos-soa-explorations-part-1, thanks Ben) about structures of arrays (SoA) and arrays of structures (AoS). It compares how the two layouts affect performance, using a particle system as the example.
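In case the two terms are unfamiliar, this is roughly what the layouts look like for a particle system. It is only an illustrative sketch: the struct names, fields and update functions are hypothetical and not taken from the linked article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures (AoS): each particle's fields sit next to each other.
struct ParticleAoS { float x, y, vx, vy; };

void updateAoS(std::vector<ParticleAoS>& particles, float dt) {
    for (ParticleAoS& p : particles) {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
    }
}

// Structure of arrays (SoA): one contiguous array per field.
struct ParticlesSoA {
    std::vector<float> x, y, vx, vy;
};

void updateSoA(ParticlesSoA& particles, float dt) {
    const std::size_t n = particles.x.size();
    assert(particles.y.size() == n && particles.vx.size() == n && particles.vy.size() == n);
    // Each loop walks plain float arrays, which is easy for the compiler
    // (or hand-written SIMD) to process several elements at a time.
    for (std::size_t i = 0; i < n; ++i) particles.x[i] += particles.vx[i] * dt;
    for (std::size_t i = 0; i < n; ++i) particles.y[i] += particles.vy[i] * dt;
}
```

Both functions compute the same result; only the memory layout differs, and which one is faster for you is something to measure.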

Shelling
  • I have used `glGetString(GL_VERSION)` to determine that I am currently using Version 2.1.2. Could you point me to a *beginner* tutorial that explains how to use *Point Sprites*? I've read a bit about it and it sounds great, but I couldn't find any suitable code yet. Regarding SoA vs AoS, are you referring to http://divergentcoder.com/programming/aos-soa-explorations-part-1/? (Someone posted it in an answer to another question I had earlier :) ) – Ben Sep 23 '11 at 00:26
  • Ha, yeah, that's exactly what I meant :D Ok, I found this site again: http://www.codesampler.com/oglsrc/oglsrc_6.htm, I basically used that code the last time. It doesn't require much change on your side because it still uses immediate mode. After that, if you want to, you can still look into how to combine it with more advanced OpenGL techniques like VBOs etc. – Shelling Sep 23 '11 at 00:41
  • I've tried some code for Point Sprites (see edit in my original question), but I don't see anything on my screen? – Ben Sep 23 '11 at 00:52
  • Using immediate mode rendering is a *huge* speed kill, but it's not always visible at first. If your app is particularly GPU bound, then it won't hit until you start to become CPU bound; in your case, that's probably the 3000 particle mark. Getting rid of this may be difficult, since you're recreating the buffer every frame (unless you can use a single quad particle and play with transforms), but having a single draw call will save you a **ton** of CPU time in your app *and* the driver. – ssube Sep 23 '11 at 00:55
  • @Ben Does glGetError() return any error state? Did you set glPointSize() and glPointParameterf() correctly? – Shelling Sep 23 '11 at 01:06
  • The problem was that I didn't define the point size. I've updated the code in my original question again now. This works, but I didn't notice any performance increase *at all*!? I'm getting the error state 1282, but it gets displayed correctly!? And I didn't use glPointParameterf()!? – Ben Sep 23 '11 at 01:31
  • That's the point peachykeen and mwd mentioned. You can only make a certain number of OpenGL calls. I actually hoped that point sprites would help at least a bit, since switching to vertex arrays and such will require a bit more work. I will extend my answer in a sec. – Shelling Sep 23 '11 at 02:03
  • Thank you, I think I will try Vertex Arrays next, I'll have a look at the document mwd referred to. I also found a (rather complicated looking) tutorial that seems very relevant on http://en.wikibooks.org/wiki/OpenGL_Programming/Scientific_OpenGL_Tutorial_01 – Ben Sep 23 '11 at 02:30
  • I have found examples of how to use VAs and VBOs on http://www.gamedev.net/topic/422214-attractors---big-number-of-points-problem/. The VA example works, but for the VBO example, I get the following errors: `'glGenBuffersARB', 'glBindBufferARB', 'glBufferDataARB' was not declared in this scope|`. I am using SDL and so far I have only included `#include "SDL_opengl.h"`. What else do I need? – Ben Sep 23 '11 at 13:09
  • I've never worked with SDL, so I can't help you with that, sorry. However, I read something about putting #define GL_GLEXT_PROTOTYPES before including SDL. Maybe you need to update your OpenGL library as well, if you get linker errors (unresolved symbols) – Shelling Sep 23 '11 at 13:19
  • Hey again, I've got both the VA and the VBO version working. Unfortunately, I don't really see a performance difference!? What else can I do to improve the performance? – Ben Sep 28 '11 at 05:57

Consider using vertex arrays instead of immediate mode (glBegin/End): http://www.songho.ca/opengl/gl_vertexarray.html

If you are willing to get into shaders, you could also search for "vertex shader" and consider using that approach for your project.

mwd