1

I'm doing an assignment on fork(), exec() and related UNIX calls where I need to show the zombie state of a (child) process. Here's the relevant piece of code:

pid = vfork();  // used vfork() to show the zombie (Z) state
if (pid > 0)
  {
    (some sorting code)
    execl("/bin/ps", "/bin/ps", "a", (char *)0);
  }

What I expect is:

(child's output)
(parent's output)
(Output of the ps command where I then would be able to show a 'defunct' entry)

What I get is:

(child's output)
(parent's output) 
No ps command output. Instead I get: Signal 17 (CHLD) caught by ps (procps version 3.2.8)

However, when sleep(int time) (some integer time in seconds) is inserted before the execl call, I get the desired output and no Signal errors are reported.

What's happening here? Does ps become the new parent of the (as-yet zombie) child? And why does the ps command not execute? What does sleep() do that makes ps execute as required?

I'm new to POSIX/Linux programming, so any explanation of how this SIGCHLD signal relates to my particular situation would be appreciated. Thanks!

Linuxios
Kedar Paranjape
  • What is "(some sorting code)"? If there is any chance that it might trigger a system call, it might be unsafe or at least undefined. Usually you can only do *trivial* things (e.g. a bunch of calculations that don't allocate memory or call the OS) before calling some `execXX()` function in a child. – Kevin Grant Jul 21 '12 at 15:58
  • (some sorting code) is just code for a selection sort of an array of integers. I haven't used any other system calls, pointers, or malloc() either. – Kedar Paranjape Jul 21 '12 at 16:02
  • @KevinGrant: With `vfork()`, true. With normal `fork()`, no. – Linuxios Jul 21 '12 at 16:09
  • It may depend on the UNIX variant but I've definitely seen versions of normal `fork()` with caveats on what children can do. – Kevin Grant Jul 21 '12 at 16:29
  • Note that in general, the only safe things to do in a `vfork()` child are calling `exec*()`, `_exit()`, and/or modifying a variable of type `pid_t` to store the result of `vfork()`. If you want to wait for the child, use `fork()` and one of the `wait*()` family of functions. – ninjalj Jul 21 '12 at 18:50

2 Answers

0

I might be wrong, but I think what's happening is this:

  • Your child starts and does the sorting code while the parent blocks.
  • The child exits.
  • The parent does its half of the if, executing ps.
  • After ps is started, SIGCHLD is sent to the parent process (which is now running ps) because of the termination of the child. Signal delivery can be slow and unpredictable.
  • If you add the sleep, SIGCHLD is delivered to the parent, who ignores it, and then control passes to ps.
Linuxios
  • What does "child is still alive" mean? Isn't it a zombie already? I'm using `vfork()`, which causes the parent to wait until the child's state has changed (we could probably assume 'terminated' counts as a 'state change' here). So, at the time of executing `execl`, the child process _has_ to have already terminated by way of completion. – Kedar Paranjape Jul 21 '12 at 16:23
  • @black_stallion: Hm. I just read up in the man page, and you're right. – Linuxios Jul 21 '12 at 16:26
  • `system("ps")` would be a quick fix, but I am hoping to understand why `execl` cannot, apparently, be used. – Kedar Paranjape Jul 21 '12 at 16:34
  • @black_stallion: What do you do in the child? What code is in the `else` branch of your `if`? – Linuxios Jul 21 '12 at 16:41
  • Two `printf` statements followed by a sort function call (again, no system calls used in this function) and an `exit(EXIT_SUCCESS)` before the ending `}` – Kedar Paranjape Jul 21 '12 at 16:49
  • +1 (Can't upvote yet!) Yes, that's a plausible explanation. In that case, it seems that `ps`, after receiving signals such as `SIGCHLD`, does not itself execute. Perhaps this is implementation-specific..? – Kedar Paranjape Jul 21 '12 at 16:59
  • @black_stallion: It probably is implementation-specific. And if the answer helped, you can always accept it by clicking the check mark below the vote count. I'd upvote the question again if I could, great question. – Linuxios Jul 21 '12 at 17:02
  • @black_stallion: Sure. Just keep asking and answering, and you can get very far very quickly. Any user who puts this much into a first question is a great member of this community. – Linuxios Jul 21 '12 at 17:05
  • Yep;) `SO` is simply great! Just wanted my question to help and be helped! – Kedar Paranjape Jul 21 '12 at 17:08
  • @black_stallion: Great job with that. It's really nice to come across a well formatted, specific, answerable question. I'd give it 20 +1s if I could. – Linuxios Jul 21 '12 at 17:09
  • Hehe! A Problem Well-stated is Half-solved! Now, where did I get _that_?! – Kedar Paranjape Jul 21 '12 at 17:12
  • @black_stallion: Ooh! Shiny quote! – Linuxios Jul 21 '12 at 17:15
-1

Title

ps -ef fails with "Signal 17 (CHLD) caught by ps (procps version 3.2.8)" on Redhat 6.6

Description

When running a ps -ef command on Redhat 6.6 it fails with the following error: "Signal 17 (CHLD) caught by ps (procps version 3.2.8)"

Cause

This is a 3rd Party issue. Redhat have created the following article to track the issue:

https://access.redhat.com/solutions/1235753

Resolution

Please refer to the Redhat article for the latest workarounds. https://access.redhat.com/solutions/1235753 These include the renaming of the libfreebl3.chk files as follows:

# mv /lib/libfreebl3.chk /lib/libfreebl3.chk-bz1153759
# mv /lib64/libfreebl3.chk /lib64/libfreebl3.chk-bz1153759

Additional Information

This appears to have been fixed by RedHat now. See RHBA-2014:1867

Cthulhu