2

I have a library which registers an atfork handler (via pthread_atfork()) which does not support multiple threads when fork() is called. In my case, I don't need the forked environment to be usable, because all I want is to call exec() right after the fork(). So I want fork(), but without any atfork handlers. Is that possible? Am I missing any important edge cases?

For background info, the library is OpenBlas, the issue is described here and here.

Albert
  • If OpenBlas registers an atfork handler that *in general* does not work reliably in multi-threaded programs, then OpenBlas is buggy in that regard. `pthread_atfork()` has no other intended use case than that one. If this is indeed an issue with OpenBlas, and you're not prepared to either fix it yourself or wait for the maintainers to fix it, then your best option may be to choose a different BLAS library. – John Bollinger Oct 18 '17 at 14:35

2 Answers

3

You could use vfork() (the NPTL implementation doesn't call fork handlers for it). Although POSIX has removed vfork() from the standard, it is likely available on your implementation. The vfork(2) man page says:

Fork handlers established using pthread_atfork(3) are not called when a multithreaded program employing the NPTL threading library calls vfork(). Fork handlers are called in this case in a program using the LinuxThreads threading library. (See pthreads(7) for a description of Linux threading libraries.)
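As a minimal sketch of the vfork() route (assuming Linux with glibc/NPTL; `run_with_vfork` is a hypothetical helper name, not something from the question), the child does nothing except exec or `_exit`:

```c
#define _DEFAULT_SOURCE            /* exposes vfork() in glibc's <unistd.h> */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv, wait for it, and
 * return its exit status, or -1 on failure. */
int run_with_vfork(char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;                 /* vfork() itself failed */
    if (pid == 0) {
        /* Child: the parent is suspended and memory is shared, so do
         * nothing here except exec or _exit. */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as `{"/bin/sh", "-c", "exit 3", NULL}` should return the child's exit status.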

Or use posix_spawn(), which is similar to vfork(). Its man page says:

According to POSIX, it is unspecified whether fork handlers established with pthread_atfork(3) are called when posix_spawn() is invoked. On glibc, fork handlers are called only if the child is created using fork(2).
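A rough sketch of the posix_spawn() variant could look like the following. The helper name `run_with_posix_spawn` is made up for illustration, and whether fork handlers run still depends on the C library, as the quoted man page text explains:

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;             /* pass the current environment through */

/* Hypothetical helper: spawn argv[0] (looked up via PATH), wait for it, and
 * return its exit status, or -1 on failure. */
int run_with_posix_spawn(char *const argv[])
{
    pid_t pid;
    /* No file actions and no spawn attributes: plain "fork + exec" semantics. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;                 /* posix_spawnp returns an error number */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the fork and the exec happen inside a single library call, there is no window in which your own code runs in the child.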

Or call syscall() and use SYS_clone directly. SYS_clone is the system call number used to create threads and processes on Linux. So syscall(SYS_clone, SIGCHLD, 0); should work, provided you exec immediately.
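The raw clone call could be wrapped roughly as below. This is only a sketch under the assumption of a Linux architecture where the flags and the stack pointer are the first two clone arguments (x86-64, for example); `run_with_raw_clone` is a hypothetical name. The remaining arguments are passed explicitly as zero-valued longs so that nothing is left undefined in the variadic call:

```c
#define _GNU_SOURCE                /* for syscall() */
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child with the raw clone system call (so the
 * C library never sees a fork and runs no atfork handlers), exec argv in it,
 * and return the child's exit status, or -1 on failure. */
int run_with_raw_clone(char *const argv[])
{
    /* flags = SIGCHLD only and stack = 0 gives fork-like behaviour: the child
     * gets a copy of the address space and the parent is sent SIGCHLD when
     * the child exits. */
    long pid = syscall(SYS_clone, (long)SIGCHLD, 0L, 0L, 0L, 0L);
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: exec immediately; library state may be inconsistent. */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }
    int status;
    if (waitpid((pid_t)pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the C library is bypassed entirely, none of its bookkeeping for fork happens in the child, which is why the immediate exec matters.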

syscall(SYS_fork); (as answered by Shachar) would likely work too. But note that SYS_fork is not available on some platforms (e.g., aarch64, ia64). SYS_fork is considered obsolete on Linux and exists only for backward compatibility; the Linux kernel uses SYS_clone for creating all "types" of processes.

(Note: These options are mostly limited to glibc/Linux).

P.P
1

Yes. The following should work on Linux (and, I think, all glibc based platforms):

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

...
  syscall(SYS_fork);

This bypasses the library and directly calls the system call for fork. You might run into trouble if your platform does not implement fork as a single system call. For Linux, that simply means that you should use clone instead.

With that in mind, I'm not sure I'd recommend doing that. Since you're a library, you have no idea why someone registered an atfork handler. Assuming it's irrelevant is bad programming practice.

So you lose portability in order to do something that may or may not break stuff, all in the name of, what? Saving a few function calls? Personally, I'd just use fork.

Shachar Shemesh
  • Note that this approach, if available to you in your implementation, is an alternative to calling `fork()` directly. It does not address the case of calling `fork()` indirectly, such as via `system()` or `popen()`. – John Bollinger Oct 18 '17 at 14:49
  • I guess there is no way to make the code more platform independent, so e.g. for MacOSX, I would need other code? Another question: If I intend to call `exec()` right after the `fork()`, what possible reasons could there be to run any atfork handlers? – Albert Oct 18 '17 at 15:15
  • @Albert But your C library's fork doesn't know you are going to exec immediately ;-) Even if it did, it still can't skip fork handlers as it's required to run them per documentation. Perhaps, `posix_spawn` is your best bet if you are looking for portability. – P.P Oct 18 '17 at 15:28
  • @usr: Of course it doesn't know. But e.g. `vfork()` seems to do that, like a `fork()` but without calling the atfork handlers. My question was whether there is any case at all, when I want to call `exec()` afterwards, in which I would still want the atfork handlers to run. – Albert Oct 18 '17 at 15:37
  • The most common use case for `pthread_atfork` is to keep resources (mutexes, condition variables, etc.) in a multi-threaded program in a consistent state in the child process after fork. If you immediately exec, there's no need to worry about any such inconsistencies - you have totally replaced it with a new process image. – P.P Oct 18 '17 at 15:55