
Are there any buffer limitations or specific guidelines for sharing standard error in POSIX/Linux among multiple forked processes?

perror("Some descriptor related error: ");

I have a server application which calls perror when needed. As a single process it works fine. With multiple processes created with fork(), after running the server for a while (during which it prints errors many times as they occur), it starts printing the error statement continuously and goes into an infinite loop.

I verified, by commenting out the print statement, that the server runs normally otherwise.

So it appears to me there might be some buffer-overflow-like scenario where a buffer for standard error runs out after some time.

I have not used any mutex or semaphore around perror.

The server code is large; it uses epoll to handle multiple client descriptors with a pool of worker processes that pick up clients as they arrive.
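
The structure is roughly like this (a simplified, hypothetical sketch; the real workers run `epoll_wait()` and handle client I/O where the simulated error is):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_WORKERS 4   /* placeholder pool size */

static void worker_loop(void)
{
    for (;;) {
        /* stand-in for epoll_wait() and per-client I/O in the real server */
        sleep(1);
        errno = EBADF;  /* simulate a failing descriptor operation */
        perror("Some descriptor related error: ");
    }
}

int main(void)
{
    for (int i = 0; i < NUM_WORKERS; i++) {
        pid_t pid = fork();
        if (pid == 0) {             /* child: becomes a worker */
            worker_loop();
            _exit(EXIT_FAILURE);
        }
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
    }
    while (wait(NULL) > 0)          /* parent: reap workers */
        ;
    return 0;
}
```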

– fkl

1 Answer


You do need to keep in mind that while I/O may be thread-safe depending on the platform, output to stderr and stdout is not multiprocess-safe. So if you have multiple processes writing to the terminal with no inter-process synchronization mechanism to make each write atomic, they are going to end up writing over each other. By "atomic" I mean that the entire length of each message from each process is written in one go. Without that, you may end up with fragments of the different processes' messages smashed together, as each process atomically writes some number of bytes, but not the entire message, before having to yield to the next process contending for the terminal buffer resource.
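
For example, one way to keep each message whole is to format the entire message into a buffer first and then emit it with a single `write(2)` call, rather than letting stdio flush `perror`'s output in pieces. A minimal sketch (`perror_atomic` is a hypothetical helper, not a standard function):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the whole message first, then hand it to the kernel in a single
 * write() so stdio cannot flush it in fragments that interleave with
 * output from the other forked processes. */
static void perror_atomic(const char *prefix)
{
    char buf[256];
    int saved = errno;
    int len = snprintf(buf, sizeof buf, "%s: %s\n", prefix, strerror(saved));

    if (len > 0) {
        /* snprintf reports the untruncated length; clamp to the buffer */
        size_t n = (size_t)len < sizeof buf ? (size_t)len : sizeof buf - 1;
        write(STDERR_FILENO, buf, n);
    }
    errno = saved;
}
```

As noted in the comments below, a single `write()` is only guaranteed to be atomic for pipes and FIFOs up to `PIPE_BUF` bytes, but issuing one `write()` per message at least prevents a single message from being flushed in fragments.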

Now, your infinite loop could be caused by a single process. If you comment out all your error statements, how can you know that the server is working "perfectly"? For instance, if only a single forked process is deadlocked, the rest of the processes may be fine and the server may appear to be functioning "normally", when in fact you've merely masked the bug, not eliminated it.

– Jason
  • Thanks, I understand. But can I rely on the fact that the output otherwise seems perfectly complete and sequential, i.e. not overwritten at all until the infinite perror printing starts, and conclude that this overwriting is not the cause? I mean, I should have seen some overlapped printing then, rather than continuous, repeated, but complete perror console prints? – fkl Jul 24 '12 at 15:49
  • Although the question was about multi-*processing*, not multi-*threading*, the stdio library is usually thread-safe (the GNU libc implementation certainly is) -- if you call `fprintf` simultaneously from multiple threads, the messages will be serialized in some order, but each message will be whole and intact. The `write(2)` system call is atomic, so simultaneously writing to the same file descriptor from multiple processes is again well-behaved; you just don't know what order the writes will happen in. – Adam Rosenfield Jul 24 '12 at 15:55
  • Even if `stderr` were to have a buffer overflow issue, the memory being used by the output buffer in the terminal process is not shared with the memory of your running process, so you can't corrupt your running process' heap or stack if such a bug with the terminal buffer were to exist ... the repeatedly printed errors are most likely some issue in one of the running forked processes going into an infinite loop. – Jason Jul 24 '12 at 15:55
  • @AdamRosenfield Writes are only atomic up to a certain message length though, correct? – Jason Jul 24 '12 at 15:56
  • @AdamRosenfield Apparently according to this question (http://stackoverflow.com/questions/594851/is-fprintf-thread-safe), calls to `write()` from different processes (i.e., forked processes) are not "safe" – Jason Jul 24 '12 at 16:00
  • Sorry, you're right, `write(2)` isn't always atomic. Writes of length up to `PIPE_BUF` are atomic for pipes/FIFOs, but no guarantees are otherwise made. – Adam Rosenfield Jul 24 '12 at 16:00
  • The first thing that the Linux kernel does on a `read()`/`write()` syscall is lock the file description, so that if the whole message is passed to `write()` it won't be mixed up with messages from other threads or processes. It's an implementation detail, but I don't see how it could be implemented differently. – Maxim Egorushkin Jul 24 '12 at 16:02
  • @MaximYegorushkin: That doesn't make the process uninterruptible from the OS's perspective, though ... the main idea here is that he has multiple forked processes, not multiple threads, thus even though a call to `write()` may be atomic, it can be interrupted. – Jason Jul 24 '12 at 16:05
  • @Jason, if one of the processes had gone into an infinite loop, it should have stopped picking up and processing any more clients. I verified that this is not the case, so I am still not certain what the cause is. – fkl Jul 24 '12 at 16:20
  • @fayyazkl: If you do not have a process going into an infinite loop, then who is doing all the printing of the error messages? Have you tried appending a process ID to each error message to see who's doing the endless printing (see the sketch below)? – Jason Jul 24 '12 at 16:33
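
Something along the lines of the following would do it (`perror_pid` is a hypothetical helper; `perror` itself cannot prepend the PID, so this rebuilds its output with `fprintf` and `strerror`):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Diagnostic variant of perror(): tag each error line with the PID of
 * the process that printed it, so the endless printer identifies itself. */
static void perror_pid(const char *prefix)
{
    fprintf(stderr, "[pid %ld] %s: %s\n",
            (long)getpid(), prefix, strerror(errno));
}
```

If the repeated lines all carry the same PID, a single worker is stuck in a loop; if the PIDs vary, the problem involves state shared across the workers.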