I am making a C program that uses fork and execv to run other programs in parallel.

I can't seem to time the execution of the program called by execv, since the new process dies as soon as that program finishes running. Another complication is that the parent process can't simply wait for the child to finish (I am using waitpid), because I need the parent to do other work in the meantime.

So my question is: is there a way to measure the duration of the execv call without the use of an auxiliary fork, pthread or text file?

Thank you in advance

Saucy Goat
  • You could take a time stamp at the beginning of `main`, and register a handler with `atexit`. The handler can take the second time stamp, and compute the difference. Of course, if the code always reaches the end of `main`, then you can just take the second time stamp there. – user3386109 Sep 29 '18 at 20:45
  • @user3386109 Thank you for the reply. Now, I don't see that working; if I created those time stamps on the program being executed, the parent process wouldn't have access to them anyway. Or did I miss something? – Saucy Goat Sep 29 '18 at 20:59
  • If you can't modify the program that you invoke to do the timing (and you probably shouldn't even if you could), then you are stuck without using some auxiliary fork/exec operation. The reliable ways all involve a parent process managing the timing while the child executes. If your immediate parent process can't dedicate itself to timing, then you'll need to create another process that can dedicate itself to timing, which means you'll need an extra fork/exec cycle. That's the long and the short of it: no can do without an extra process (fork/exec). – Jonathan Leffler Sep 29 '18 at 21:05
  • Hmm, are you using the returned status for anything? The status returned by `waitpid` could be used to communicate the time difference. Of course, you need to be careful with the encoding, since you only have 8 bits to play with. – user3386109 Sep 29 '18 at 21:05
  • @JonathanLeffler Thank you for the reply. I guess not all StackOverflow posts have happy endings :) Cheers! – Saucy Goat Sep 29 '18 at 21:10
  • @user3386109 I am currently. But I may figure out a way not to need it. As to that last part; well, I'm a C noob. Could you tell me how I'd go about using the return status and that encoding you mentioned or point me towards somewhere I could read about it? – Saucy Goat Sep 29 '18 at 21:14
  • The simplest approach is to execute a variant on `/usr/bin/time your-command arg1 arg2 …`, but the `time` command will itself fork and exec the command for you. You can write your own code to do the job, maybe reporting time in microseconds or even nanoseconds (though the accuracy of such times might be a bit debatable). But you're right, sometimes there isn't a positive answer to questions asked on SO; sometimes, people ask for what's not possible, and then the best that can be said is "sorry; you can't do that as you'd like". – Jonathan Leffler Sep 29 '18 at 21:19
  • When you call `exit()` with a return code, or simply `return` from `main` with a return code, that value is made available to the parent through the `stat_loc` argument to `waitpid`. So the place to start is the man page for `waitpid`, specifically the part where it discusses the `WEXITSTATUS` macro. – user3386109 Sep 29 '18 at 21:23
  • As for encoding the time difference, the best encoding depends on how much time you expect the program to take. For example, if the program takes less than a quarter of a second, then the time can just be a number from 0 to 250 in milliseconds. – user3386109 Sep 29 '18 at 21:26
  • @JonathanLeffler I have to thank you again for answering. Though it was not possible to achieve what I thought was possible, your answers were enlightening and very useful. – Saucy Goat Sep 29 '18 at 21:26
  • @user3386109 Okay, so I've got everything down but the part of how to do the actual encoding. Could I store the time in something like an unsigned char? – Saucy Goat Sep 29 '18 at 21:28
  • Gut feel: using the exit status isn't going to work all that well. On traditional Unix systems, you can only record exit statuses in the range 0..255. It also relies on the process reporting its time consumed in the exit status. It's not completely impossible; I'm just not convinced it is reliable or precise (especially if you want sub-second timing). – Jonathan Leffler Sep 29 '18 at 21:30
  • I agree with that. Also, your shell really wants to know what the child's status return was; otherwise, you can't report to the user when processes fail. – rici Sep 29 '18 at 21:31
  • @JonathanLeffler and rici you are truly knights in shiny armors when it comes to C. Thank you both for being around. The paradigm has been vanquished, thank you! user3386109 though my original question has been answered, I do appreciate your replies and the idea you presented was very creative, to say the least. I'll give it a go as well after having this work. Cheers! – Saucy Goat Sep 29 '18 at 21:35

1 Answer

Your parent process knows when it issued the fork() system call. That's not exactly the moment that the execv'd process starts running, since the execv() system call takes some amount of time, but it's not totally unreasonable to include that time in the tally. If you accept that limitation, you can just record the start time as the time at which you called fork().

When the child terminates, the parent will receive a SIGCHLD signal. The default action for SIGCHLD is to ignore it, but you probably want to change that anyway. If you attach a signal handler to SIGCHLD, then in that signal handler you can call waitpid (with the WNOHANG option) until you've received all the child terminated notifications. For each notification, you record the notification time as the process's end time. (Again, if the system is under heavy load, the signal might lag from the termination, causing your time measure to be inaccurate. But most of the time, it will be accurate.)
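For illustration, the reaping loop could look roughly like the sketch below. It is not a full solution: it only remembers the most recent child, and the names `on_sigchld`, `install_sigchld_handler`, `reaped_count`, `last_status` and `last_end` are made up for this example. The choice of `CLOCK_MONOTONIC` is also an assumption; any clock you read consistently at start and end would do.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical globals: written by the handler, read by the main program. */
static volatile sig_atomic_t reaped_count = 0;
static volatile sig_atomic_t last_status = 0;
static struct timespec last_end;   /* end time of the most recently reaped child */

void on_sigchld(int signo)
{
    (void)signo;
    int saved_errno = errno;       /* waitpid() may overwrite errno */
    int status;

    /* One SIGCHLD may stand for several exits, so loop until nothing is left. */
    while (waitpid(-1, &status, WNOHANG) > 0) {
        clock_gettime(CLOCK_MONOTONIC, &last_end);  /* async-signal-safe in POSIX */
        last_status = status;
        reaped_count++;
    }
    errno = saved_errno;
}

int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* Restart interrupted system calls; ignore stop/continue notifications. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Call `install_sigchld_handler()` once, before the first fork(), so that no termination can arrive before the handler is in place.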

Clearly, the parent needs to track more than one child process. So you'll need to use the child's PID to index these values.

Now you have a start time and an end time for each child process.

There's a small problem, though. You cannot attach the start time to the child process's PID until the fork() call returns to the parent. But it's entirely possible that fork() will return to the child, the child will call execv(), and the execv()'d process will terminate, all before the fork() call returns to the parent. (Honest. It happens.)

So it is possible for the SIGCHLD handler to receive a notification of the termination of a process whose start time has not yet been recorded.

This is easy to fix, but when you do so you need to take into account the fact that signal handlers cannot allocate memory. So if you're recording the start and end time information in dynamically allocated storage, you need to have allocated storage before the signal handler runs.

So the code will look something like this:

1. Allocate storage for a new process times table entry
   (PID / start time / end time / status result). Set all
   fields to 0 to indicate that the entry is available.
2. Record the current time as start_time (a local variable,
   not the table entry).
3. Fork()
4. (Still in the parent). Using an atomic compare-and-swap
   (or equivalent), set the PID of the table entry created
   in step 1 to the child's PID. If the entry was 0 (and is
   now the PID) or if the entry was already the PID, then
   continue to step 6.
5. If the entry has some other non-zero PID, find an empty entry
   in the table and return to step 4.
6. Now record the start time in the table entry. If the table entry
   already has an end time recorded, then the signal handler already
   ran and you know how long it took and what its return status is.
   (This is the case where the child terminated before you got to
   step 4.) You can now report this information.
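The parent-side steps might be sketched as follows. Treat it as one possible reading, not a definitive implementation. The table layout, `MAX_CHILDREN`, `claim_entry` and `record_start` are hypothetical names. As a simplification of my own, the parent always scans from slot 0 instead of starting at a preselected entry; that only works if the handler also claims the lowest free slot and only the main program ever frees slots.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

#define MAX_CHILDREN 64   /* assumed upper bound on children being tracked */

struct child_entry {
    pid_t pid;                       /* 0 means the slot is free */
    struct timespec start, end;
    int status;
    volatile sig_atomic_t has_end;   /* set to 1 by the SIGCHLD handler */
};

/* Static storage: nothing has to be allocated once the handler is live. */
static struct child_entry table[MAX_CHILDREN];

/* Steps 4 and 5: return the entry owned by `pid`, claiming a free one if
 * the handler has not already created it. NULL means the table is full. */
struct child_entry *claim_entry(pid_t pid)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        pid_t expected = 0;
        if (__atomic_compare_exchange_n(&table[i].pid, &expected, pid, false,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
            return &table[i];        /* slot was free and is now ours */
        if (expected == pid)
            return &table[i];        /* the handler got here first */
        /* otherwise the slot belongs to another child: keep looking */
    }
    return NULL;
}

/* Step 6: store the start time taken before fork(). *already_done is set
 * to true when the handler recorded the end before we got here. */
struct child_entry *record_start(pid_t pid, struct timespec start,
                                 bool *already_done)
{
    struct child_entry *e = claim_entry(pid);
    if (e == NULL)
        return NULL;
    e->start = start;
    *already_done = (e->has_end != 0);
    return e;
}
```

In the parent you would read the clock, call fork(), and then pass the child's PID and the saved timestamp to `record_start()`; if `already_done` comes back true, the duration can be reported right away.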

In the SIGCHLD signal handler, you need to do something like this:

For each successful call to waitpid():
1. Find the entry in the child process information table whose PID
   corresponds to the PID returned by waitpid(). If you find one,
   skip to step 4.
2. Find an empty entry in the child process information table.
   Note that the signal handler cannot be interrupted by the main
   program, so locking is not required here.
3. Claim that entry by setting its PID field to the PID returned by
   waitpid() above.
4. Now that you have an entry, record the end time and return status
   information in the table entry. If the table entry existed
   previously, you need to put the entry on a notification queue
   so that the main process can notify the user. (You cannot call
   printf in a signal handler either.) If the table entry didn't
   exist before, then the main process will notice by itself.
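A rough sketch of the handler side, under assumptions of my own: the same hypothetical table layout as the parent would use, a `pending_reports` counter standing in for the notification queue, and `CLOCK_MONOTONIC` as the clock. `record_exit` and `on_sigchld` are illustrative names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

#define MAX_CHILDREN 64   /* assumed upper bound on children being tracked */

struct child_entry {
    pid_t pid;                       /* 0 means the slot is free */
    struct timespec start, end;
    int status;
    volatile sig_atomic_t has_end;   /* 1 once the end time is recorded */
};

static struct child_entry table[MAX_CHILDREN];

/* Stand-in for the notification queue: the main loop polls this counter
 * and then looks for entries with has_end set. */
static volatile sig_atomic_t pending_reports = 0;

/* Steps 1 to 4 for one reaped child. Returns NULL if the table is full. */
struct child_entry *record_exit(pid_t pid, int status, struct timespec end)
{
    struct child_entry *e = NULL, *free_slot = NULL;

    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (table[i].pid == pid) { e = &table[i]; break; }       /* step 1 */
        if (table[i].pid == 0 && free_slot == NULL)
            free_slot = &table[i];                               /* step 2 */
    }
    int existed = (e != NULL);
    if (!existed) {
        if (free_slot == NULL)
            return NULL;             /* no room: this timing is lost */
        e = free_slot;
        e->pid = pid;                /* step 3: plain store, no lock needed */
    }
    e->end = end;                    /* step 4 */
    e->status = status;
    e->has_end = 1;
    if (existed)
        pending_reports++;           /* parent already moved on: notify it */
    return e;
}

void on_sigchld(int signo)
{
    (void)signo;
    int saved_errno = errno, status;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        record_exit(pid, status, now);
    }
    errno = saved_errno;
}
```

Everything called here (waitpid, clock_gettime) is on the POSIX async-signal-safe list, and nothing allocates memory or prints.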

You might have to draw some diagrams to convince yourself that the above algorithm is correct and has no race conditions. Good luck.

Also, if you haven't done any of these things before, you'll want to do some reading :-)

  • waitpid(). Pay particular attention to the macros used to extract status information.

  • sigaction(). How to assign a handler function to a signal. If that's still Greek to you, start with signal(7) or a relevant chapter in your Unix programming textbook.

  • Race conditions (from Wikipedia)

  • Compare and Swap (on Wikipedia). (Don't use their sample code; it doesn't work. GCC has built-in extensions that implement atomic compare-and-swap on any architecture that has a way of supporting it. I know the `__sync` built-ins are marked legacy and that you are supposed to use the more complicated `__atomic` functions described in the section after them, but in this case the defaults are fine. But if you use __atomic_compare_exchange_n, kudos.)
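As a small example of the status macros mentioned in the first item, here is one way to turn a waitpid() status into text. `describe_status` and `wait_for_exit_code` are hypothetical helpers written for this illustration; the second one exists only to produce a real status value to decode.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writes a short description of a waitpid() status into buf and returns
 * the number of characters snprintf() would have written. */
int describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))      /* normal exit: low 8 bits of the exit code */
        return snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    if (WIFSIGNALED(status))    /* terminated by an uncaught signal */
        return snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    return snprintf(buf, len, "other state change");
}

/* Demo only: forks a child that exits at once with `code`, waits for it
 * synchronously, and returns the raw status (or -1 on failure). */
int wait_for_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The raw status is not the exit code itself, which is why the macros are needed before comparing against the value the child passed to exit().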

rici
  • Quite the read. After (I think) understanding your solution, I came up with a solution of my own. If possible I'd like your opinion with this model: Variables: - arrays in which to store PIDs, start and end times, and the status result. All are initialized using mmap in order to be accessible by all processes - global counter for the total amount of processes (using mmap) Child process: record PID and start time on index `process_counter-1`, increment total process counter Handler: record end time and status on index `process_counter-1` – Saucy Goat Sep 30 '18 at 01:52
  • @saucy: it's your project, so you should do it the way you want to. I already said how I would do it, which I think is the best way :-) because despite the apparent complexity, it requires no interprocess coordination, nor does it require an additional fork. (If I understand what you are proposing, I don't see how your signal handler can figure out the process index. Also consider that shells are used for a long time -- days, in my case -- so a process counter might not be the best id.) – rici Sep 30 '18 at 02:35
  • I'll give my little scheme a try and then ask my teacher whether there are flaws in interprocess communication, which there likely are. Also, it'd be the child itself, not the handler, to store the process ID. I'll try and get things to work using my spaghetti plan, and once I have a deeper understanding of what I'm doing I'll definitely take another look at what you posted. Thank you :) – Saucy Goat Sep 30 '18 at 11:06
  • @saucy: if the child is filling in the data, then using a scoreboard will work, although you still need to consider how to deal with long-lasting shells. However, there are only two ways to get something to happen at the end of the child execution: modify the child (which doesn't let you run standard utilities) or do an extra fork and execv, which doubles your process count. But that could well still be acceptable within your environment. Have fun with it! – rici Sep 30 '18 at 14:23