2

I'm calling execvp() with a deliberately wrong argument in a fork()'ed child. The errno number is properly set to ENOENT in the child process. I then terminate the child process with _exit(errno);.

My main process calls wait(). When I inspect the returned status with WIFEXITED and WEXITSTATUS I always get EINVAL for the first invocation. All other invocations return the correct ENOENT code.

I cannot explain this behavior. Below is the complete function, which does all of the above things, but a bit more complex.

QVariantMap
System::exec(const QString & prog, const QStringList & args)
{
  pid_t pid = fork();

  if (pid == 0) {
    int cargs_len = args.length() + 2;
    char * cargs[cargs_len];
    cargs[cargs_len - 1] = NULL;

    QByteArrayList as;
    as.push_back(prog.toLocal8Bit());

    std::transform(args.begin(), args.end(), std::back_inserter(as),
        [](const QString & s) { return s.toLocal8Bit(); });

    for (int i = 0; i < as.length(); ++i) {
      cargs[i] = as[i].data();
    }

    execvp(cargs[0], cargs);

    // in case execvp fails, terminate the child process immediately
    qDebug() << "(" << errno << ") " << strerror(errno);  // <----------
    _exit(errno);

  } else if (pid < 0) {
    goto fail;

  } else {

    sigset_t mask;
    sigset_t orig_mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    if (sigprocmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
      goto fail;
    }

    struct timespec timeout;
    timeout.tv_sec = 0;
    timeout.tv_nsec = 10 * 1000 * 1000;

    while (true) {
      int ret = sigtimedwait(&mask, NULL, &timeout);

      if (ret < 0) {
        if (errno == EAGAIN) {
          // timeout
          goto win;
        } else {
          // error
          goto fail;
        }

      } else {
        if (errno == EINTR) {
          // not SIGCHLD
          continue;
        } else {
          int status = 0;
          if (wait(&status) == pid) {
            if (WIFEXITED(status)) {
              return { { "error", strerror(WEXITSTATUS(status)) } };
            } else {
              goto fail;
            }
          } else {
            goto fail;
          }
        }
      }
    }
  }

win:
  return {};

fail:
  return { { "error", strerror(errno) } };
}

It turns out that removing the line with the qDebug() call makes the problem go away. Why does adding a debugging call change the behavior of the program?

Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254
CuriousMan
  • 101
  • 6
  • 2
    While `exit` (and `_exit`) accepts an `int`, it really can't handle anything else than a signed byte, meaning the value should be between 0 and 255 (inclusive). If the exit code is outside that range, I think the behavior is undefined. – Some programmer dude Aug 01 '15 at 11:28
  • 2
    Why is this tagged `c`? It looks a lot more like `c++` to me. – EOF Aug 01 '15 at 11:33
  • 1
    @JoachimPileborg Of course you're right but `ENOENT == 2`, so at least for that it should be correct. – CuriousMan Aug 01 '15 at 11:37
  • @EOF It's a mix. The whole POSIX / Linux system API is C though and the relevant bits of this function have nothing to do with C++. – CuriousMan Aug 01 '15 at 11:38
  • I assume that you have checked that`errno` actually *is* what you think it is? Do you print it out after `execvp` fails, or use a debugger to check? – Some programmer dude Aug 01 '15 at 11:39
  • @JoachimPileborg Yes I did. I removed all the debugging printf's from the example above. If I print out the value of `errno` right after `execvp` it's `ENOENT` every time. – CuriousMan Aug 01 '15 at 11:58
  • 1
    It's odd that you block `SIGCHLD` but never unblock it. You will accumulate a queue of pending `SIGCHLD` signals. If you don't want to handle them then you don't need to block them -- they will be ignored by default. Note, however, that that's different from when you explicitly set the signal disposition to `SIG_IGN` -- in the latter case the exit status discarded and you cannot `wait()` on the child, but not in the former. – John Bollinger Aug 01 '15 at 17:22
  • Shouldn't this "*`if (WEXITED(status)`*" read "*`if (WIFEXITED(status))`*"? – alk Aug 01 '15 at 20:01
  • 1
    @JohnBollinger Signals are queued only when using the real-time signals interface, which is not the case here. Standard signals are not queued; if a signal is raised when another instance of that same signal is already pending, only one is delivered when (and if) you unblock it. So there's no accumulation of queued signals here. – Filipe Gonçalves Aug 01 '15 at 23:31
  • @alk Thanks, fixed it. – CuriousMan Aug 02 '15 at 12:22
  • @FilipeGonçalves @JohnBollinger That's interesting, thanks. To be frank I simply followed an example for the `sigtimedwait` bit. – CuriousMan Aug 02 '15 at 12:25
  • Turns out I've inserted a call to `qDebug()` before the `_exit(errno)` piece. Apparently `qDebug()` has some funky side effects, because when I remove it (like in the sample code above), everything works fine. I'll add this as answer and accept it if nobody objects. Thanks everyone for helping out! :) [A debugging facility should not introduce unpredictable side effects which alter the behavior of a program] – CuriousMan Aug 02 '15 at 12:44
  • 1
    Based on your last comment, do I correct understand that you cannot reproduce the problem without a call to `qDebug`? If so, either edit your question to reflect this or delete it as moot. In general, please post [complete code](/help/mcve) to reproduce the problem (including headers, `main`, etc.). – Gilles 'SO- stop being evil' Aug 02 '15 at 13:06
  • @Gilles Yes, that's correct. I've updated the example code reflect the actual problem. I left the "noise" because I thought it was not relevant. I'd never suspected that the call to `qDebug()` would be the cause. – CuriousMan Aug 02 '15 at 13:15

1 Answers1

3
qDebug() << "(" << errno << ") " << strerror(errno);
_exit(errno);

Pretty much any call to a standard library function can modify errno. It's likely that qDebug calls some I/O functions which set errno, or maybe even the << I/O operator. errno is not modified by most successful calls, but the higher level you get, the less you can know that there aren't some normal failed calls under the hood. So the value of errno that you're printing isn't the value of errno that you pass to _exit.

As a general principle with errno, if you're doing anything more complex than just printing it out once, save the value to a variable before doing anything else.

As already remarked in comments, note that most Unix systems including all the common ones only pass 8-bit values as exit statuses, but errno can be greater than 255. For example, if you run this program on a system where 256 is a possible error code, calling _exit(256) would result in the caller seeing a return code of 0, and thus incorrectly believing success.

Generally it's enough to collapse all error values to success/failure. If you need to discriminate more than that, make sure that the information you're passing via exit/wait fits in the range 0–255.

int exec_error = errno;
qDebug() << "(" << exec_error << ") " << strerror(exec_error);
_exit(!!exec_error);
Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254