I need to speed up a calculation whose result is then used to draw an OpenGL model. I achieved a major speed-up by changing std::vector to Concurrency::concurrent_vector and using parallel_for instead of plain for loops. The vector (or concurrent_vector) is filled inside the for (or parallel_for) loop and contains the vertices for OpenGL to render.

This works fine with std::vector, because the OpenGL rendering code relies on std::vector storing its items contiguously, which is not the case with concurrent_vector. The code looks something like this:

glVertexPointer(3, GL_FLOAT, 0, &vectorWithVerticesData[0]);

Generating a concurrent_vector and then copying it to a std::vector is too expensive, since there are a lot of items.

So, the question is: I'd like to use OpenGL vertex arrays, but I'd also like to use concurrent_vector, whose storage is incompatible with what OpenGL expects.

Any suggestions?

genpfault
IgorStack
  • @Igor: As I have another problem with `concurrent_vector`, I wanted to ask what you mean by saying that it doesn't keep its items in sequence. I was under the impression that `concurrent_vector` has the same or a similar memory layout as a vector and just guards against race conditions via, e.g., internal mutexes. – MikeMB Jun 05 '14 at 18:07

1 Answer

You're trying to use a data structure that doesn't store its elements contiguously in an API that requires contiguous storage. Well, one of those has to give, and it's not going to be OpenGL. GL isn't going to walk concurrent_vector's data structure (not if you like performance).

So your only option is to not use a container with non-contiguous storage.

I can only guess at what you're doing (since you didn't provide example code for the generator), so that limits what I can advise. If your parallel_for iterates a fixed number of times, then you can just use a regular vector. By "fixed", I mean a count that is known immediately before parallel_for executes and doesn't change based on how many iterations have already run.

Simply size the vector with vector::resize. This will value-initialize the elements, which means that every element exists. You can now perform your parallel_for loop, but instead of using push_back or whatever, you simply write each element directly into its location in the output. I think parallel_for can iterate over the actual vector iterators, but I'm not positive. Either way, it doesn't matter; you won't get any race conditions unless you try to set the same element from different threads.
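
A minimal sketch of that idea, under some assumptions: it uses plain std::thread as a portable stand-in for parallel_for, and both computeVertex and buildVertices are hypothetical names invented for illustration (the real generator wasn't posted). Each thread writes only to its own index range of a pre-sized vector, so no two threads touch the same element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-vertex generator: writes x, y, z for vertex i.
// Stands in for whatever the real model computation is.
inline void computeVertex(std::size_t i, float* out) {
    out[0] = static_cast<float>(i);
    out[1] = static_cast<float>(i) * 2.0f;
    out[2] = 0.0f;
}

// Fills a contiguous std::vector<float> (3 floats per vertex) in parallel.
// The vertex count must be known before the threads start.
std::vector<float> buildVertices(std::size_t vertexCount, unsigned threadCount = 4) {
    assert(threadCount > 0);

    std::vector<float> vertices;
    vertices.resize(vertexCount * 3);  // every element exists from here on

    // Split [0, vertexCount) into disjoint chunks, one per thread.
    const std::size_t chunk = (vertexCount + threadCount - 1) / threadCount;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(vertexCount, t * chunk);
        const std::size_t end = std::min(vertexCount, begin + chunk);
        if (begin == end) continue;  // nothing left for this thread
        workers.emplace_back([&vertices, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                computeVertex(i, vertices.data() + i * 3);  // write in place, no push_back
        });
    }
    for (auto& w : workers) w.join();
    return vertices;
}
```

The result is contiguous, so it can be handed to OpenGL the same way as before, e.g. `glVertexPointer(3, GL_FLOAT, 0, vertices.data())`. With the PPL, the manual chunking should be replaceable by something like `Concurrency::parallel_for(std::size_t(0), vertexCount, [&](std::size_t i) { computeVertex(i, vertices.data() + i * 3); });`, though I haven't tested that variant.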

Nicol Bolas
  • Yes, I used that pre-allocation trick on a couple of loops. There I could calculate the size of the vector beforehand and then use operator[] inside parallel_for. For this particular loop that is not the case: the geometry model can change, and the number of vertices can change drastically. – IgorStack Mar 30 '12 at 23:11