I am reading the pseudocode for a barrier synchronization algorithm from this paper, and I do not fully understand it.
The goal of the code is to create a barrier for multiple threads (no thread can pass the barrier until all threads have reached it) using something called a "software combining tree" (I'm not sure what that means).
Here is the pseudocode (though I encourage you to look at the article as well):
type node = record
    k : integer          // fan-in of this node
    count : integer      // initialized to k
    locksense : Boolean  // initially false
    parent : ^node       // pointer to parent node; nil if root

shared nodes : array [0..P-1] of node
    // each element of nodes allocated in a different memory module or cache line

processor private sense : Boolean := true
processor private mynode : ^node    // my group's leaf in the combining tree

procedure combining_barrier
    combining_barrier_aux (mynode)  // join the barrier
    sense := not sense              // for next barrier

procedure combining_barrier_aux (nodepointer : ^node)
    with nodepointer^ do
        if fetch_and_decrement (&count) = 1    // last one to reach this node
            if parent != nil
                combining_barrier_aux (parent)
            count := k                         // prepare for next barrier
            locksense := not locksense         // release waiting processors
        repeat until locksense = sense
I understand that it implies building a binary tree, but there are a few things I don't understand:
- Is P the number of threads?
- What is k? What does "fan-in of this node" mean?
- The article mentions that threads are organized as groups on the leaves of the tree. What groups?
- Is there exactly one node for each thread?
- How do I get "my group's leaf in the combining tree"?
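
To make the questions concrete, here is how I currently read the pseudocode, as a rough Python sketch. Everything beyond the pseudocode is my own guess rather than something the paper states: I assume a "group" is fan_in consecutive threads sharing one leaf, that the thread count is a power of fan_in, and that a lock can stand in for an atomic fetch_and_decrement. The names build_tree, run_demo and fan_in are hypothetical helpers I made up for illustration.

```python
import threading
import time


class Node:
    def __init__(self, k):
        self.k = k                 # fan-in: number of arrivals this node waits for
        self.count = k             # arrivals still missing in the current episode
        self.locksense = False     # flipped by the last arriver to release waiters
        self.parent = None         # None plays the role of nil (root)
        self._lock = threading.Lock()  # assumption: emulates an atomic instruction

    def fetch_and_decrement(self):
        # Returns the value of count *before* the decrement.
        with self._lock:
            old = self.count
            self.count = old - 1
            return old


def build_tree(num_threads, fan_in):
    """Guess at the layout: returns the leaf for each thread id.

    Assumes num_threads is a power of fan_in (and at least fan_in).
    """
    leaves = [Node(fan_in) for _ in range(num_threads // fan_in)]
    level = leaves
    while len(level) > 1:
        parents = [Node(fan_in) for _ in range(len(level) // fan_in)]
        for i, node in enumerate(level):
            node.parent = parents[i // fan_in]
        level = parents
    # Threads 0..fan_in-1 share leaf 0, the next fan_in share leaf 1, and so on.
    return [leaves[t // fan_in] for t in range(num_threads)]


def combining_barrier_aux(node, sense):
    if node.fetch_and_decrement() == 1:       # last one to reach this node
        if node.parent is not None:
            combining_barrier_aux(node.parent, sense)
        node.count = node.k                   # prepare for next barrier
        node.locksense = not node.locksense   # release waiting processors
    while node.locksense != sense:            # repeat until locksense = sense
        time.sleep(0)                         # yield instead of a hot spin


def run_demo(num_threads=4, fan_in=2, rounds=3):
    leaf_of = build_tree(num_threads, fan_in)
    arrived = [0] * rounds
    arrived_lock = threading.Lock()
    violations = []   # (thread id, round) pairs that got past the barrier too early

    def worker(tid):
        sense = True                          # processor private sense
        for r in range(rounds):
            with arrived_lock:
                arrived[r] += 1
            combining_barrier_aux(leaf_of[tid], sense)
            sense = not sense                 # for next barrier
            if arrived[r] != num_threads:
                violations.append((tid, r))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, violations
```

With this layout, run_demo(8, 2) would build 4 leaves, 2 inner nodes and 1 root, so there are fewer nodes than threads. That seems at odds with the array [0..P-1] in the pseudocode, which is part of why I am asking what P is.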