
Recently I started writing some tests for my Python scripts. For some awkward reason, the module that runs a Python script and checks its output is written in C, with some other languages mixed in. This is the most convenient setup for me for now.

A single test runs with the code below:

 FILE *fd = NULL;

 fd = popen("cmd", "r");
 if(NULL == fd){
  fprintf(stderr, "popen: failed\n");
  return 1;
 }
 fprintf(stderr, "res = %d: %s\n", errno, strerror(errno));

 int res = pclose(fd);
 fprintf(stderr, "res = %d: %s\n", res, strerror(errno));

As you can see, the code just runs a script with the help of popen and checks its exit status. But one day I ran into a situation where popen was given a wrong argument. Something like this happened:

fd = popen("python@$#!", "r");

And the test module printed:

res = 0: Success
sh: 1: python@0!: not found
res = 32512: Success

So popen ran happily despite the above mistake, and only pclose returned a non-zero status, with errno still being zero. In between, the shell also printed its own message.

Here is my question: how can I detect that the shell failed to execute a command? The failure could be for any reason, but the main point is that the script did not even start.

user14063792468
  • Note that no standard C or POSIX library function ever sets `errno` to zero. Printing an error message based on `errno` when `fd` is not NULL is not appropriate; the error number is not from `popen()` (or is not set because `popen()` failed). Printing `res` after `pclose()` is OK; adding `strerror(errno)` runs into the same problem (the information in `errno` may be entirely irrelevant). You can set `errno` to zero before calling a function. If the function returns a failure indication, it may be relevant to look at `errno`. _[…continued…]_ – Jonathan Leffler Feb 01 '20 at 00:47
  • _[…continuation…]_ However, `errno` can be set non-zero by a function even if it succeeds. Solaris standard I/O used to set `errno = ENOTTY` if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does. – Jonathan Leffler Feb 01 '20 at 00:47
  • @JonathanLeffler The man for `popen` and `pclose` does say that the functions set `errno` on failure. Not any failure but some. I do not run any functions in between or right after `popen` or `pclose`, then why should `errno` be irrelevant? – user14063792468 Feb 01 '20 at 00:54
  • Yes, but they may also set `errno` to a non-zero value even when they succeed. You can't interpret `errno` meaningfully unless the function first reports failure through its return value. – Jonathan Leffler Feb 01 '20 at 00:56
  • @JonathanLeffler Also I run `linux`, tag updated. – user14063792468 Feb 01 '20 at 00:56
  • @JonathanLeffler That is the case. The return value of both functions is OK. But the shell failed. What should I do? That was the question. – user14063792468 Feb 01 '20 at 00:58
  • I don't understand your output. You have `res =` in your printf strings, but not in your output. Is the output you show actually the output you are getting, or have you made a substantive copy/paste error? – William Pursell Feb 01 '20 at 01:11
  • @WilliamPursell I made an edit. This is, and to some degree was, the exact output. – user14063792468 Feb 01 '20 at 01:16
  • @JonathanLeffler *Solaris standard I/O used to set `errno = ENOTTY` if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does.* That's perfectly legal [per 7.5p3](https://port70.net/~nsz/c/c11/n1570.html#7.5p3): "The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard." Quite a few of the stdio functions aren't documented to set `errno`, such as `printf()` and `fwrite()`. – Andrew Henle Feb 01 '20 at 01:26
  • @AndrewHenle — yes, I was using as an example of "it is legal for a function to set `errno` even if the function itself succeeds". – Jonathan Leffler Feb 01 '20 at 01:28

1 Answer


General comments about when to use errno

No standard C or POSIX library function ever sets errno to zero. Printing an error message based on errno when fd is not NULL is not appropriate; the error number is not from popen() (or is not set because popen() failed). Printing res after pclose() is OK; adding strerror(errno) runs into the same problem (the information in errno may be entirely irrelevant). You can set errno to zero before calling a function. If the function returns a failure indication, it may be relevant to look at errno (look at the specification of the function — is it defined to set errno on failure?). However, errno can be set non-zero by a function even if it succeeds. Solaris standard I/O used to set errno = ENOTTY if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does. And Solaris setting errno even on success is perfectly legitimate; it is only legitimate to look at errno if (1) the function reports failure and (2) the function is documented to set errno (by POSIX or by the system manual).

See C11 §7.5 Errors <errno.h> ¶3:

The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function. [202] The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.

[202] Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.

POSIX is similar (errno):

Many functions provide an error number in errno, which has type int and is defined in <errno.h>. The value of errno shall be defined only after a call to a function for which it is explicitly stated to be set and until it is changed by the next function call or if the application assigns it a value. The value of errno should only be examined when it is indicated to be valid by a function's return value. Applications shall obtain the definition of errno by the inclusion of <errno.h>. No function in this volume of POSIX.1-2017 shall set errno to 0. The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified.

popen() and pclose()

The POSIX specification for popen() is not dreadfully helpful. There's only one circumstance under which popen() 'must fail'; everything else is 'may fail'.

However, the details for pclose() are much more helpful, including:

If the command language interpreter cannot be executed, the child termination status returned by pclose() shall be as if the command language interpreter terminated using exit(127) or _exit(127).

and

Upon successful return, pclose() shall return the termination status of the command language interpreter. Otherwise, pclose() shall return -1 and set errno to indicate the error.

That means that pclose() returns the value it received from waitpid(): the termination status of the shell that ran the command, which is normally the exit status of the command itself. Note that it must use waitpid() (or an equivalently selective function; hunt for wait3() and wait4() on BSD systems); it is not authorized to wait for any child process other than the one created by popen() for this file stream. There are also prescriptions that pclose() must make sure the child has exited, even if some other function waited on the dead child in the interim and thereby caused the system to lose the status for the child created by popen().

If you convert decimal 32512 to hexadecimal, you get 0x7F00. And if you used the WIFEXITED and WEXITSTATUS macros from <sys/wait.h> on that, you'd find that the exit status is 127 (because 0x7F is 127 decimal, and the exit status is encoded in the high-order bits of the 16-bit status returned by waitpid()).

int res = pclose(fd);

if (WIFEXITED(res))
    printf("Command exited with status %d (0x%.4X)\n", WEXITSTATUS(res), res);
else if (WIFSIGNALED(res))
    printf("Command exited from signal %d (0x%.4X)\n", WTERMSIG(res), res);
else
    printf("Command exited with unrecognized status 0x%.4X\n", res);

And remember that 0 is the exit status indicating success; anything else normally indicates an error of some sort. You can further analyze the exit status to look for 127 or relayed signals, etc. It's unlikely you'd get a 'signalled' status, or an unrecognized status.

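pclose() told you that the child failed: the status decodes to exit code 127, which is what the shell reports when it cannot find the command.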

Of course, it is possible that the executed command actually exited itself with status 127; that's unavoidably confusing, and the only way around it is to avoid exit statuses in the range 126 to 128 + 'maximum signal number' (which might mean 126 .. 191 if there are 63 recognized signals). The value 126 is also used by POSIX to report when the interpreter specified in a shebang (#!/usr/bin/interpreter) is missing (as opposed to the program to be executed not being available). Whether that's returned by pclose() is a separate discussion. And the signal reporting is done by the shell because there's no (easy) way to report that a child died from a signal otherwise.

Jonathan Leffler
  • My point about `errno` is that the code above is almost the whole program. I've extracted the bare bones, so it is the full program except for the includes and the function declaration. My intuition tells me that `errno` should be zero (or some meaningful value) upon program entry. And this is the case. – user14063792468 Feb 01 '20 at 01:32
  • ...So if _any_ error showed up, it could _only_ come from the invocation of `popen` or `pclose`. Anyway, many thanks for the answer. I'll dig through the docs for now. – user14063792468 Feb 01 '20 at 01:34