8

According to the man page (2) the exit function is not thread safe : MT-Unsafe race:exit, this is because this function tries to clean up resources (flush data to the disk, close file descriptors, etc...) by calling callbacks registered using on_exit and atexit. And I want my program to do that ! (one of my thread keeps a fd open during the whole program's lifespan so _exit is not an option for me because I want all the data to be written to the output file)

My question is the following : if I'm being careful and I don't share any sensible data (like a fd) between my threads, is it "acceptable" to call exit in a multi-threaded program ? Note that I'm only calling exit if an unrecoverable error occurs. Yet, I can't afford having a segfault while the program tries to exit. The thing is, an unrecoverable error can happen from any thread...

I was thinking about using setjmp/longjmp to kill my threads "nicely" but this would be quite complex to do and would require many changes everywhere in my code.

Any suggestions would be greatly appreciated. Thanks ! :)

EDIT : Thanks to @Ctx enlightenment, I came up with the following idea :

#define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)

Of course the exit_mutex must be global (extern).

curiousguy
  • 8,038
  • 2
  • 40
  • 58
ShellCode
  • 1,072
  • 8
  • 17
  • If the program needs to be able to cleanly shut down in any case design it like this. Reading your question I have the impression you did not thought about that until just now facing what you call an "*Unrecoverable Error*", which, to be honest, I doubt there were. And if it were, you probably had other problems then bringing down your program cleanly ...;) – alk Jul 23 '19 at 10:59
  • 2
    *one of my thread keeps a fd open during the whole program's lifespan so `_exit` is not an option for me because I want all the data to be written to the output file* Integer-type file *descriptors* do not buffer data in the process's address space. Any `write()` (or similar) call that's completed when you call `_exit()` will be in the file it was written to (barring things like power failures that crash the system before the page cache is flushed to disk...) `exit()` only causes problems with `FILE *` based streams that buffer data. – Andrew Henle Jul 23 '19 at 11:24
  • @alk I have many *legit* unrecoverable errors that can happen. However I've just realized now that exit() is not thread-safe. @AndrewHenle Good to know ! Thanks ! But I still think `_exit` is kind of gross anyway... Better clean what's cleanable :) – ShellCode Jul 23 '19 at 11:34
  • 1
    @ShellCode *Better clean what's cleanable* Even better is to not make any messes that need cleaning in the first place. Ideally, any process should be able to take a `SIGKILL` or a power failure at any moment and be perfectly happy when it's restarted. You're spending how much time trying to make your cleanup robust? And even if you do that, you don't protect against a power failure or incorrectly aimed `SIGKILL`. – Andrew Henle Jul 23 '19 at 11:59
  • @AndrewHenle Of course I can't predict everything but I try to make it as robust as possible... TBH, my main concern is to exit my program without receiving any SIGSEGV (that's why I'm making sure exit will not mess with my thread safety), now you are probably tempted to mention `_exit` again, but the thing is I don't know every underlying Linux concepts and calling `exit` safely instead of the brutal `_exit` is a way for me to know I'm not making anything stupid and that resources will be cleaned up. Don't know if I'm making sense rn... :sweat_smile: – ShellCode Jul 23 '19 at 12:12
  • Oh and btw ! You said **exit() only causes problems with FILE * based streams that buffer data** did you mean `_exit` ? Because I'm using FILE* everywhere :x – ShellCode Jul 23 '19 at 12:17
  • `exit()` will flush all your open `FILE *` streams. That can be a problem if, for example, `FILE *` structures have internal locks of any kind and one thread is holding that lock for some reason. Then the call to `exit()` will hang. (This is one reason why `exit()` is not async-signal-safe and can't be safely called from within a signal handler.) Can any of your uses of a `FILE *` hold such a lock indefinitely? Perhaps by blocking for input on a read/write `FILE *` stream? – Andrew Henle Jul 23 '19 at 12:47
  • Well... If I'm not mistaken, fread/fwrite calls are blocking (there's no `O_NONBLOCK` flag when using fopen) but I don't see why it would prevent an `exit()` from finishing... But tbh I don't really know how the FILE struct works internally, I might be wrong here. – ShellCode Jul 23 '19 at 12:59
  • 1
    Can you give one good example of what you mean by an "unrecoverable error"? I'm having a very hard time imagining what your actual use case is. – David Schwartz Jul 23 '19 at 22:34
  • @ShellCode Since `fread` must return data (fill the buffer) until EOF is reached, of course it will block until data is available. There is no way for this interface to even express "interrupted, please retry now" not "nothing for now, retry a bit later". Nor is there any way for an stdio write function to do that, but as long as the user buffer is copied elsewhere, and reusable by the user, the function doesn't need to block. Reading!=writing in this regard. **Completed stdio write does not mean the data is on safe/stable storage**. – curiousguy Jul 23 '19 at 22:36
  • @AndrewHenle What if `exit()` destroys the stdio data structures? – curiousguy Jul 23 '19 at 22:38
  • "_close file descriptors_" You don't need to; it's implicit in the destruction of a process (std Unix semantics). – curiousguy Jul 23 '19 at 22:39
  • @DavidSchwartz : I'm writting encrypted stuff on the disk, if an openssl error occurs, what do you want me to do ? Write it unencrypted ? No way. Don't worry, my **unrecoverable errors** are real ;) @curiousguy So... If Linux guarantees that, would calling `_exit` be ok ? – ShellCode Jul 24 '19 at 09:02
  • @ShellCode I still don't understand. You say, "an openssl error". What does that mean exactly? *What* SSL error exactly? Obviously, if you encounter an error trying to encrypt data, you don't write unencrypted data to disk. That would seem a good reason to do a clean shutdown of your program, not calling `exit`. – David Schwartz Jul 24 '19 at 16:05
  • OpenSSL error when trying to encrypt data using AES or trying to encrypt the AES key using RSA. That's not the only **unrecoverable** error I've got, but that's probably the most obvious one. Well... Actually `exit` is pretty clean... `_exit` isn't however. The only problem with `exit` is its thread safety. – ShellCode Jul 25 '19 at 08:58
  • Related: https://stackoverflow.com/q/55063381/694576 – alk Jul 30 '19 at 05:25

3 Answers3

4

The manpage states that

The exit() function uses a global variable that is not protected, so it is not thread-safe.

so it won't help, if you are being careful in any way.

But the problem documented is a race condition: MT-Unsafe race:exit

So if you make sure, that exit() can never be called concurrently from two threads, you should be on the safe side! You can make this sure by using a mutex for example.

Ctx
  • 18,090
  • 24
  • 36
  • 51
  • Thanks for you answer ! I'm afraid making sure `exit()` is never called concurrently can be quite tedious... Unless... What do you think about that ? `#define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)` – ShellCode Jul 23 '19 at 10:29
  • @ShellCode Yes, of course, just use a mutex. I added that to the answer. – Ctx Jul 23 '19 at 10:34
  • 1
    That looks weird however. Why is `exit` not doing that natively ? There must be a reason... The program will stop anyway if you're calling exit, so why not making it thread safe ? I don't get it... – ShellCode Jul 23 '19 at 10:38
  • 2
    Maybe the reason is, that the standard programming pattern is: start process -> start n threads -> join n threads -> call exit (in the main thread). No need for locks here. – Ctx Jul 23 '19 at 10:55
  • I've gave it some thoughts, and I think the answer to that is just that not all programs are multi-threaded, so it wouldn't make sense to lock a mutex natively in the exit function – ShellCode Jul 23 '19 at 11:27
  • @ShellCode instead of `#define EXIT(status) do {... ` you can transform all `exit`s into `myexit` and implement what ever needs to be done in `myexit`. – Jabberwocky Jul 23 '19 at 11:37
  • Yeah well isn't it kind of the same idea ? Just that instead of wrapping the function at runtime, I do it during the preprocessing – ShellCode Jul 23 '19 at 11:39
  • It can't be made safe and reliable. You have to stop all your threads. – curiousguy Jul 23 '19 at 11:51
1

A modern cross-platform C++ solution could be:

#include <cstdlib>
#include <mutex>

std::mutex exit_mutex;

[[noreturn]] void exit_thread_safe(const int status)
{
    exit_mutex.lock();
    exit(status);
}

The mutex ensures that exit is never called by 2 (or more) different threads.

However, I still question the reason behind even caring about this. How likely is a multi-threaded call to exit() and which bad things can even realistically happen?

EDIT:
Using std::quick_exit avoids the clang diagnostic warning.

BullyWiiPlaza
  • 17,329
  • 10
  • 113
  • 185
0

It can't be done: even if no data is shared between threads at first, data must be shared between a thread and its cleanup function. The function should run only after the thread has stopped or reached a safe point.

curiousguy
  • 8,038
  • 2
  • 40
  • 58
  • Ok but what if you mutex lock the call to exit ? There's not concurrent call to exit, and therefore cleanup thread functions will be sequentially called and the thread safety isn't threatened. At least that's what I think and that's also what @Ctx seems to think... Can you please elaborate ? – ShellCode Jul 23 '19 at 11:59
  • The thread safety of what? Why *specifically* do you need to call `exit` and not `_exit`? – curiousguy Jul 23 '19 at 13:32
  • I've got a FILE* open during the whole program's lifespan. If I use `_exit`, buffered data will not be written to the disk. And there are probably other reasons I don't even notice (I don't know by heart the implem of every function in the stdio). But calling `exit` instead of `_exit` seems safer, and in my context, I really need safety – ShellCode Jul 23 '19 at 13:36
  • 2
    The point is that you are doing cleanup in one thread while some other is using that resource – curiousguy Jul 23 '19 at 14:24
  • Ok I see what you mean... You're right indeed, but that should be safe as long as I don't share anything between my threads, right ? (I don't have any custom cleanup functions, I'm just assuming the stdio has its own) – ShellCode Jul 23 '19 at 14:41
  • 1
    You seem to be missing the point: 1) there is sharing between the thread using an stdio global object and the one calling `exit`; 2) clean exit means running destructors of global objects. You can't do that w/o stopping all threads that can use these object objects. – curiousguy Jul 23 '19 at 22:28
  • What kind of stdio global object are you talking about ? I don't see any... What do you suggest then ? What about instead of calling exit() in any of my thread, I send a SIGUSR1 to the main thread, kill all the threads, and then call exit ? – ShellCode Jul 24 '19 at 08:46
  • Can you show some real world code? You have to use a global somewhere. – curiousguy Jul 24 '19 at 16:25
  • I use some global variables yeah, but only mutexes and pthread conditions basically... At first sight, nothing that need clean up when exit() is called. And no I can't show any code, sorry :/ Thanks for helping me figuring it out, really appreciated ! – ShellCode Jul 25 '19 at 08:54
  • You can't have just a mutex and CV, ever. You need some normal variable with state! – curiousguy Jul 25 '19 at 15:52
  • I still don't understand where you're going with this, there are no destructors in C so no code is executed during `exit()` unless you explicitly registered a clean up callback using `atexit()`. The only global variables I have are as I said mutexes, CVs and atomic booleans, no funky stuff that would need clean up when `exit()` is called. So I don't see how this could be unsafe as long as I don't make concurrent calls to `exit()` which I prevent with a mutex (in the macro you can see in the edit of my post – ShellCode Jul 25 '19 at 21:25
  • The point of using `exit` is that you need some `atexit` stuff, correct? Some global state is needed to use `atexit` to do any cleanup. Just forget `exit`, its MT safety, and also forget `atexit`. Just simplify the code with a cleanup function. How can cleanup be MT safe? If other threads are still using the resources, it probably can't. – curiousguy Jul 25 '19 at 22:10
  • No I don't have any `atexit()`. The point of using exit is to be able to exit my program from any thread. I don't have any clean up functions (and I don't need any), that's what I'm trying to tell you from the beginning ^^ Some functions however register cleanup callbacks for me however (fopen does that I think) – ShellCode Jul 26 '19 at 08:01
  • @ShellCode Please explain how "I call `atexit` in user code" and "I have library code that might use `atexit`" is different in any significant way. Except in the later case, you can't even restrict what kind of stuff is done, and you can't analyse to know whether it's MT safe. – curiousguy Jul 26 '19 at 18:48
  • @ShellCode "_Some functions however register cleanup callbacks for me however (fopen does that I think)_" Do you imagine that `fopen` would *not* use some kind of global variable? Do you think it doesn't need to be protected by a mutex? – curiousguy Jul 27 '19 at 03:42