Environment: Linux 2.6.32 (RHEL 6.3) on x86_64 with gcc 4.4.6
Background: I am doing some heavy data crunching: ~500 GB of input data spread over ~2000 files. My main process forks N children, each of which receives a list of filenames to crunch.
What I want is for console I/O to pass through the parent. I have been looking into pipe() and see some fascinating stuff about using poll() to have my parent block until there are error messages to read. It seems that I need N pipes (one per child) and have to tell poll() which events I want to listen for on each. Also, I think that once I dup2(pipe[1], STDOUT) in each child, each child should be able to write to the pipe with cout << stuff; as usual, right?
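As a minimal sketch of the child-side setup described above (one pipe, one child, stdout redirected with dup2()), assuming POSIX headers are available: the helper name capture_child_stdout and the message text are hypothetical, chosen only for illustration.

```cpp
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks one child whose stdout is redirected into a
// pipe, then returns everything the child printed with std::cout.
std::string capture_child_stdout() {
    int fds[2];
    if (pipe(fds) == -1) return "";

    std::cout.flush();  // do not let the child inherit unflushed output
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }

    if (pid == 0) {  // child
        close(fds[0]);  // the child never reads from the pipe
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) == -1) _exit(1);
            close(fds[1]);  // stdout now refers to the pipe
        }
        std::cout << "hello from child" << std::endl;  // endl also flushes
        _exit(0);
    }

    // Parent: close the write end, otherwise read() never reports EOF.
    close(fds[1]);
    std::string out;
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        out.append(buf, static_cast<size_t>(n));
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);  // reap the child
    return out;
}
```

Note that the sketch uses STDOUT_FILENO from <unistd.h> as the target descriptor, and that each process closes the pipe end it does not use.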
First, is what I have said above about multiple pipes, poll()ing, and dup2() correct?
Second, how do I set up the parent poll() loop so that I move on once all the children have died?
Right now, this (incomplete) section of code reads as follows:
    int status;
    while (1) { // wait for stuff
        while ((status = poll(pollfds, ss.max_forks, -1)) > 1)
            cout << "fork "<< status << ": " << pipes[status][0];
        if (status == -1) Die(errno, "poll error");
        if (status == 0) { // check that we still have at least one open fd
            bool still_running = false;
            for (int i=0; i<ss.max_forks; i++) {
                // check pipe i and set still_running if it is not zero
            }
            if (!still_running)
                break;
        }
    }
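For comparison, here is one possible shape for such a loop, offered as a sketch rather than a definitive implementation. It relies on two documented behaviours: poll() returns the number of ready descriptors (not an index), and it ignores entries whose fd is negative. The parent counts open read ends and stops when the count reaches zero, which happens once every child has exited or closed its stdout. The function name run_children and the "child i" messages are hypothetical.

```cpp
#include <cerrno>
#include <iostream>
#include <poll.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: forks n children, each printing one line to a
// redirected stdout, and collects each child's output in the parent.
std::vector<std::string> run_children(int n) {
    std::vector<pollfd> pfds(n);
    std::vector<pid_t> pids(n, -1);
    std::vector<std::string> out(n);
    int open_fds = 0;

    std::cout.flush();
    for (int i = 0; i < n; ++i) {
        pfds[i].fd = -1;  // negative fds are skipped by poll()
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
        int fds[2];
        if (pipe(fds) == -1) continue;
        pid_t pid = fork();
        if (pid == 0) {  // child
            for (int j = 0; j < i; ++j)  // drop inherited read ends
                if (pfds[j].fd != -1) close(pfds[j].fd);
            close(fds[0]);
            if (fds[1] != STDOUT_FILENO) {
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
            }
            std::cout << "child " << i << std::endl;
            _exit(0);
        }
        close(fds[1]);  // parent keeps only the read end
        if (pid == -1) { close(fds[0]); continue; }
        pids[i] = pid;
        pfds[i].fd = fds[0];
        ++open_fds;
    }

    while (open_fds > 0) {
        int ready = poll(&pfds[0], pfds.size(), -1);
        if (ready == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            break;
        }
        for (int i = 0; i < n; ++i) {
            if (pfds[i].fd == -1 || pfds[i].revents == 0) continue;
            char buf[4096];
            ssize_t got = read(pfds[i].fd, buf, sizeof buf);
            if (got > 0) {
                out[i].append(buf, static_cast<size_t>(got));
            } else if (got == 0 || errno != EINTR) {
                close(pfds[i].fd);  // EOF or hard error: stop watching
                pfds[i].fd = -1;
                --open_fds;
            }
        }
    }

    for (int i = 0; i < n; ++i) {
        if (pfds[i].fd != -1) close(pfds[i].fd);  // only after a poll error
        if (pids[i] != -1) waitpid(pids[i], 0, 0);  // reap each child
    }
    return out;
}
```

With a timeout of -1, poll() does not return 0, so the sketch detects completion through read() returning 0 (EOF) on each pipe rather than through a zero return value.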
Third, what should I set with fcntl(), and when should I set it? Do I want O_ASYNC? Do I want blocking or nonblocking I/O?
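To make the blocking versus nonblocking choice concrete, the following sketch shows how O_NONBLOCK could be set on a pipe's read end with fcntl(), and what read() then does on an empty pipe. It is an illustration under assumptions, not a recommendation: both function names are hypothetical, and O_ASYNC (signal-driven I/O) is deliberately left out because poll() already does the waiting in the setup described here.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: adds O_NONBLOCK to the descriptor's existing
// file status flags instead of overwriting them.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical demonstration: on an empty pipe that still has an open
// writer, a nonblocking read() fails with EAGAIN instead of waiting.
bool empty_pipe_read_would_block() {
    int fds[2];
    if (pipe(fds) == -1) return false;

    bool ok = set_nonblocking(fds[0]);
    char c;
    ssize_t n = read(fds[0], &c, 1);
    int err = errno;  // save errno before any further calls

    close(fds[0]);
    close(fds[1]);
    return ok && n == -1 && (err == EAGAIN || err == EWOULDBLOCK);
}
```

If the parent only calls read() on descriptors that poll() has just reported as ready, a blocking read end also works; the nonblocking flag mainly guards against a read() stalling the whole loop.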