
I just found that GCC's OpenMP implementation (libgomp) doesn't call pthread_exit(). I need that call in order to use PerfSuite for profiling.

Is there any way to tell GCC to insert a pthread_exit() call at the end of an OpenMP parallel region when it transforms the OpenMP code into pthreads code?

I am using GCC 4.7.0 and PerfSuite 1.1.1.

Rakib

1 Answer


libgomp implements thread pools. Once created, a thread in the pool remains idle until it is signalled to become a member of a thread team. After the team finishes its work, the thread goes back into an idle loop until it is signalled again. The pool grows on demand but never shrinks. Threads are only signalled to exit when the program finishes.
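To make the pool behaviour concrete, here is a minimal sketch of the idea using plain pthreads. It is not libgomp's code: every name in it (`pool_start`, `pool_run_region`, `pool_shutdown`, `POOL_SIZE`, and so on) is made up for illustration, the pool has a fixed size instead of growing on demand, and error handling is omitted. The only point is that the same threads serve every "region" and that none of them terminates until shutdown.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4   /* hypothetical fixed size; libgomp grows its pool on demand */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;  /* master -> workers */
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;  /* workers -> master */
static pthread_t workers[POOL_SIZE];
static void (*region_fn)(int);  /* work for the current region */
static unsigned generation;     /* bumped once per region */
static int pending;             /* workers still busy in the current region */
static int shutting_down;

static void *worker_main(void *arg)
{
    int id = (int)(intptr_t)arg;
    unsigned seen = 0;

    pthread_mutex_lock(&lock);
    for (;;) {
        /* Idle loop: sleep until a new region starts or the pool shuts down. */
        while (generation == seen && !shutting_down)
            pthread_cond_wait(&wake, &lock);
        if (shutting_down)
            break;
        seen = generation;
        void (*fn)(int) = region_fn;
        pthread_mutex_unlock(&lock);
        fn(id);                         /* do this thread's share of the region */
        pthread_mutex_lock(&lock);
        if (--pending == 0)
            pthread_cond_signal(&done); /* last worker wakes the master */
    }
    pthread_mutex_unlock(&lock);
    return NULL;                        /* threads end only at shutdown */
}

void pool_start(void)
{
    for (intptr_t i = 0; i < POOL_SIZE; i++)
        pthread_create(&workers[i], NULL, worker_main, (void *)i);
}

/* Rough analogue of one parallel region: wake everyone, wait for all to finish. */
void pool_run_region(void (*fn)(int))
{
    pthread_mutex_lock(&lock);
    region_fn = fn;
    pending = POOL_SIZE;
    generation++;
    pthread_cond_broadcast(&wake);
    while (pending > 0)
        pthread_cond_wait(&done, &lock);
    pthread_mutex_unlock(&lock);
}

void pool_shutdown(void)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&wake);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(workers[i], NULL);
}

/* Demo: record which thread ran each slot in two consecutive regions. */
static pthread_t seen_tid[2][POOL_SIZE];
static int region_index;

static void record_tid(int id)
{
    seen_tid[region_index][id] = pthread_self();
}

/* Returns 1 if both regions were executed by the very same threads. */
int pool_demo_threads_reused(void)
{
    int reused = 1;

    pool_start();
    region_index = 0;
    pool_run_region(record_tid);
    region_index = 1;
    pool_run_region(record_tid);
    for (int i = 0; i < POOL_SIZE; i++)
        if (!pthread_equal(seen_tid[0][i], seen_tid[1][i]))
            reused = 0;
    pool_shutdown();
    return reused;
}
```

Built with `-pthread` and called from `main()`, `pool_demo_threads_reused()` returns 1: no thread exits between the two regions, which is the behaviour a profiler waiting for `pthread_exit()` will also see with libgomp.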

You can read the libgomp code that implements thread pools and teams in the 4.7.x branch here.

Pool threads are terminated like this: libgomp registers a destructor by the name of team_destructor(). It is called whenever the main() function returns, exit(3) gets called or the libgomp library is unloaded by a call to dlclose(3) (if previously loaded with dlopen(3)). The destructor deletes one pthreads key by the name of gomp_thread_destructor, which has an associated destructor function gomp_free_thread() triggered by the deletion. gomp_free_thread() makes all threads in the pool execute gomp_free_pool_helper() as their next task. gomp_free_pool_helper() calls pthread_exit(3) and thus all threads in the pool cease to exist.

Here is the same process in a nice ASCII picture:

main() returns, exit() called or library unloaded
  |
  |
team_destructor() deletes gomp_thread_destructor
  |
  |
gomp_free_thread() called by pthreads on gomp_thread_destructor deletion
  |
  +-------------------------+---------------------------+
  |                         |                           |
gomp_free_pool_helper()   gomp_free_pool_helper() ... gomp_free_pool_helper()
 installed as next task    installed as next task      installed as next task
  |                         |                           |
  |                         |                           |
pthread_exit(NULL)        pthread_exit(NULL)      ... pthread_exit(NULL)

Note that this only happens once at the end of the program execution and not at the end of each parallel region.
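The tail end of that chain can be imitated in a few lines. The following is only a sketch under simplifying assumptions, not the libgomp implementation: it uses a single worker, skips the pthreads key step, and calls the teardown directly from a function marked with GCC's `__attribute__((destructor))`. All names (`sketch_start`, `sketch_teardown`, `sketch_exit_helper`, ...) are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t sk_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sk_wake = PTHREAD_COND_INITIALIZER;
static pthread_t sk_worker;
static void (*sk_next_task)(void);  /* NULL while the worker is idle */
static int sk_running;              /* 1 while the worker thread exists */

static void *sketch_worker(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&sk_lock);
        while (sk_next_task == NULL)        /* idle until a task is installed */
            pthread_cond_wait(&sk_wake, &sk_lock);
        void (*task)(void) = sk_next_task;
        sk_next_task = NULL;
        pthread_mutex_unlock(&sk_lock);
        task();                             /* does not return for the exit helper */
    }
    return NULL;
}

/* Plays the role the answer assigns to gomp_free_pool_helper(). */
static void sketch_exit_helper(void)
{
    pthread_exit(NULL);
}

void sketch_start(void)
{
    if (!sk_running && pthread_create(&sk_worker, NULL, sketch_worker, NULL) == 0)
        sk_running = 1;
}

/* Install the exit helper as the worker's next task and wait for the thread
   to end. Returns 1 if a worker was terminated, 0 if there was none. */
int sketch_teardown(void)
{
    if (!sk_running)
        return 0;
    pthread_mutex_lock(&sk_lock);
    sk_next_task = sketch_exit_helper;
    pthread_cond_signal(&sk_wake);
    pthread_mutex_unlock(&sk_lock);
    pthread_join(sk_worker, NULL);
    sk_running = 0;
    return 1;
}

/* Runs after main() returns, on exit(3), or when the containing shared
   object is unloaded with dlclose(3). */
__attribute__((destructor))
static void sketch_destructor(void)
{
    sketch_teardown();
}
```

If a program calls `sketch_start()` and then simply returns from `main()`, the worker's `pthread_exit()` is reached only inside `sketch_destructor()`, that is, after `main()` has finished. That is the timing problem described in the note above.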

Hristo Iliev
  • Hi Hristo, thank you for the answer. As far as I understand, when the program exits, i.e. when the thread gets the signal, it doesn't call pthread_exit(). It seems I need to change the code in this file (team.c) to do this. Is that right? If so, is there a simple way to do it? – Rakib Nov 07 '12 at 17:20
  • Threads run the loop in the second half of `gomp_thread_start()` as long as they have a function to execute after the docking barrier; otherwise they exit the loop and simply return. There is one case in which threads call `pthread_exit`, and that is when `libgomp` was loaded at runtime with `dlopen(3)`. A key named `gomp_thread_destructor` with an associated destructor `gomp_free_thread()` is created, and a file destructor named `team_destructor()` is registered. When you `dlclose(3)` the library, the file destructor gets called and makes all team threads call `pthread_exit`. – Hristo Iliev Nov 07 '12 at 17:58
  • Sorry, my mistake - functions with `__attribute__((destructor))` are also called after `main()` returns. It means that pool threads would surely reach the point where they call `pthread_exit` but not before the `main` function has returned. Unfortunately the `libgomp` API is not available to the user and you cannot invoke the pool destructor before the end of `main()`. – Hristo Iliev Nov 07 '12 at 18:06
  • Very nice explanation!! I want to try loading at runtime, but I haven't done that before. If possible, could you please give me some suggestions or example code to start with? – Rakib Nov 07 '12 at 20:08
  • Read the manual page of `dlopen(3)` or just search the Internet for run-time library loading and symbol resolution. There are plenty of examples. Then you can find a reference on how different OpenMP constructs are implemented using `libgomp` in the [libgomp ABI reference](http://gcc.gnu.org/onlinedocs/libgomp/The-libgomp-ABI.html). The different function prototypes can be found in `libgomp.h` in the GCC source tree. You would need the prototypes in order to do the correct cast of symbol addresses that you obtain via calls to `dlsym(3)`. – Hristo Iliev Nov 07 '12 at 22:24
  • Actually it's possible to shrink thread pool size by forcing `gomp_team_start` to be called with `nthreads` == 2 (1 will not work): https://github.com/gcc-mirror/gcc/blob/releases/gcc-4.7/libgomp/team.c#L340 – Ilya Verbin Jun 01 '22 at 19:31
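
As a starting point for the run-time loading route discussed in the comments, here is a minimal sketch of the `dlopen(3)` / `dlsym(3)` / `dlclose(3)` pattern. It resolves the public `omp_get_max_threads()` function instead of the internal `GOMP_*` entry points, only to keep the example short. The helper name `max_threads_via_dlopen` is made up, and the soname `libgomp.so.1` used below is an assumption about a typical Linux installation.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load the given shared library at run time, look up omp_get_max_threads,
   call it, and unload the library again. Returns -1 on any failure. */
int max_threads_via_dlopen(const char *soname)
{
    void *handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* dlsym returns a void *; it has to be cast to the right prototype
       before the function can be called. */
    int (*get_max)(void) = (int (*)(void))dlsym(handle, "omp_get_max_threads");
    if (get_max == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    int n = get_max();

    /* Unloading the library is the point at which its destructors run. */
    dlclose(handle);
    return n;
}
```

A call such as `max_threads_via_dlopen("libgomp.so.1")` exercises the whole load, resolve, call, unload cycle; on older glibc versions the program has to be linked with `-ldl`. Driving real parallel regions this way would additionally require the prototypes from `libgomp.h`, as mentioned in the comment above.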