
I am working with a simulation that consists of cells: objects that interact with each other and evolve over time. I am looking into parallelizing my code using the OpenMP parallel for directive, and I want my cells (objects) to be updated in parallel each timestep. Below you can see part of the code.

#pragma omp parallel for default(none) shared(cells, maxCellReach, parameters, cellCellJunctions, timestepDistance) firstprivate(timestep), reduction(+ : centerOfMass)
    for (auto it = cells.begin(); it != cells.end(); it++) {
        maxCellReach = std::max(it->getNucleus()->getCellReach(), maxCellReach);
        it->ApplyIntercellularForce(cellCellJunctions, Constants::junctionCreateDistance, parameters);
        it->step(timestep, parameters);
        centerOfMass += it->GetCenterOfMass();
    }

I have read that the loop iterator variable of an OpenMP parallel for loop is always private. However, I want all the cells to be updated together. Does this code achieve that, or do I need a different approach with OpenMP?

I looked around online, but could not find anything about making loop iterator variables shared, or anything similar for a for loop.

John Kugelman
Jonathan
  • You need a reduction for `maxCellReach` (otherwise there is a race condition). Note also that the iterator needs to be a random-access iterator; input/forward/bidirectional iterators cannot be used in such a parallel loop (they need more advanced and more expensive constructs). – Jérôme Richard Nov 24 '22 at 22:10
  • @JérômeRichard, I included the reduction for `maxCellReach`. Would you also know how to apply a reduction to an object that's part of `cell`? Each cell consists of 100 `beads`, which have interactions, and their forces are updated by their own interactions and by interactions with beads of other cells (using `force += someValue`). The data structure is `std::vector<std::shared_ptr<Bead>> beads`, where each bead has a private force vector. – Jonathan Nov 25 '22 at 10:38
  • OpenMP provides custom user-defined reduction operators, but they are quite basic. They may fit your needs. That being said, you certainly need to implement some additional functions in your class for that. OOP often does not mix well with parallelism due to the encapsulation principle: thread-safety is quite dependent on the implementation, which has to be hidden. The split into objects also makes many parallel optimizations harder (e.g. ones based on a kind of transposition or ones requiring pre-computations). – Jérôme Richard Nov 25 '22 at 12:18
  • Note that it is not safe to update the properties of an object while other threads read it, as that would cause a race condition. If you want to do that, you certainly need a kind of double-buffering approach (i.e. save the properties, perform the parallel reduction, and only then update the properties in parallel). Note that `shared_ptr` is generally not great in terms of performance, especially in a parallel loop, as it tends to cause atomic accesses. – Jérôme Richard Nov 25 '22 at 12:20
  • @JérômeRichard Thanks! I believe I have narrowed down the problem such that it first calculates everything for each of the objects and then updates them accordingly after an (implicit) barrier, so the double buffering is kind of implemented. What would you consider better alternatives to `shared_ptr`? I believe they are used because different classes need to access the beads (but I am newish to C++). The previous author states: "The spring class "saves" the two beads at the end points of the spring as two shared pointers of the bead class called bead1 and bead2". – Jonathan Nov 26 '22 at 10:20
  • The best alternative for efficient code is to handle that manually. `shared_ptr` is meant to automatically destroy an object when it is no longer referenced, but this is expensive since it comes at the cost of tracking every copy (causing a dereference and an atomic operation). This is fine for a few objects that are not copied a lot, but not for a lot of objects. The first thing to do is to `std::move` the `shared_ptr` as much as possible. Then, you can replace them with plain pointers if you know when to delete them. – Jérôme Richard Nov 26 '22 at 20:43
  • In many cases, applications have execution patterns that enable developers to write an efficient memory management system. For example, HPC applications operating with time steps can use a monotonic allocator for objects deleted at every time step. This enables very fast allocation/freeing and no need to track these objects. The same thing also applies in games. Data structures can have their own dedicated allocator (e.g. graphs with a SLAB allocator). – Jérôme Richard Nov 26 '22 at 20:49
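The user-defined reduction mentioned in the comments could look roughly like the following. This is only a sketch: `Vec3`, `vecAdd` and `totalForce` are made-up names for illustration, not part of the asker's `Bead`/`cell` classes, which are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a force accumulator.
struct Vec3 {
    double x = 0.0, y = 0.0, z = 0.0;
};

// User-defined OpenMP reduction combining two Vec3 component-wise.
#pragma omp declare reduction(vecAdd : Vec3 :                           \
        omp_out.x += omp_in.x, omp_out.y += omp_in.y,                   \
        omp_out.z += omp_in.z)                                          \
    initializer(omp_priv = Vec3{})

// Sum all forces in parallel; each thread accumulates into a private
// Vec3 that OpenMP combines with vecAdd at the end of the loop.
Vec3 totalForce(const std::vector<Vec3>& forces) {
    Vec3 total;
    #pragma omp parallel for reduction(vecAdd : total)
    for (std::size_t i = 0; i < forces.size(); ++i) {
        total.x += forces[i].x;
        total.y += forces[i].y;
        total.z += forces[i].z;
    }
    return total;
}
```

Note that this only reduces into a single accumulator; per-bead accumulation, as described in the comments, would still need the double-buffering scheme discussed above.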

1 Answer


To be able to use OpenMP on a for loop, the loop has to fulfill the "Canonical Loop Form" specified by the standard. It is described, for example, in the OpenMP 4.5 specification, section 2.6 (starting on page 53), and in the OpenMP 5.0 specification, section 2.9.1 (starting on page 95).

Both standards specify the loop form to be

for (init-expr; test-expr; incr-expr) structured-block

with differing restrictions on init-expr, test-expr and incr-expr. While 4.5 already allows a random-access-iterator-type to be used here specifically for C++, it doesn't allow `!=` in the test-expr. This was fixed by the 5.0 standard, which is almost completely implemented by the newest versions of gcc and clang at the time of writing this answer. For compatibility with older versions of these compilers you might have to use `<` instead, if possible. For more information on which versions of which compiler implement which standard, see here.
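As a sketch, such a 4.5-compatible loop can use an index and `<` in the test-expr. Here a plain `std::vector<double>` and simple max/sum reductions stand in for the asker's cell container and the `maxCellReach`/`centerOfMass` updates (those classes are not shown in the question):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Index-based loop in canonical form accepted by OpenMP 4.5:
// the test-expr uses `<` rather than `!=`.
double maxAndSum(const std::vector<double>& cells, double& maxOut) {
    double sum = 0.0;
    double maxVal = 0.0;
    #pragma omp parallel for reduction(+ : sum) reduction(max : maxVal)
    for (std::size_t i = 0; i < cells.size(); ++i) {  // `<`, not `!=`
        maxVal = std::max(maxVal, cells[i]);
        sum += cells[i];
    }
    maxOut = maxVal;
    return sum;
}
```

Note the `reduction(max : maxVal)` clause, which also addresses the race condition on `maxCellReach` pointed out in the comments on the question.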

If your iterators do not allow random access, things get more complicated. See, for example, the tasking code in this answer.
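One possible shape of such tasking code (a sketch, not the code from the linked answer) is shown below for a `std::list`, whose bidirectional iterators rule out a worksharing loop. One thread walks the list and spawns a task per element, which all threads of the team then execute:

```cpp
#include <cassert>
#include <list>

// Tasking pattern for containers without random-access iterators:
// `single` makes one thread traverse the list sequentially, while the
// spawned tasks are picked up by all threads of the parallel region.
void doubleAll(std::list<int>& values) {
    #pragma omp parallel
    #pragma omp single
    for (auto it = values.begin(); it != values.end(); ++it) {
        #pragma omp task default(none) firstprivate(it)
        {
            *it *= 2;  // each task touches a distinct element: no race
        }
    }
}
```

The per-element task overhead is significant, so this only pays off when the work per element is substantial.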

On page 99 the 5.0 standard even allows for range-based for loops from C++11, so you could write your loop even more elegantly.
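A minimal sketch of that range-based variant, again using a `std::vector<double>` in place of the asker's cell container (requires a compiler implementing this part of OpenMP 5.0, e.g. recent gcc/clang; older compilers reject the pragma on this loop form):

```cpp
#include <cassert>
#include <vector>

// Range-based for loop as a worksharing loop (OpenMP 5.0).
double centerSum(const std::vector<double>& cells) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+ : sum)
    for (const double& c : cells) {
        sum += c;
    }
    return sum;
}
```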

As of OpenMP 5.2 there is still at least one special case (`#pragma omp simd`) which does not work on non-pointer random-access iterators.

paleonix
  • Interesting answer. Is the `ff` in `53ff` and `95ff` a typo or does it have a special meaning? – Jérôme Richard Nov 25 '22 at 12:07
  • 1
    @JérômeRichard Maybe it's a German thing, but I thought it was used in English as well when citing pages, where "f" means this page and the following one and "ff" mean that it continues over more than two pages. I should probably add section numbers as well. – paleonix Nov 25 '22 at 17:25
  • 1
    You are welcome. I got rid of it so it's easier to read. The proper way of using it would have probably been more like "page 53 f.f.", etc. – paleonix Nov 25 '22 at 17:39