1

UPDATE 3: This problem appears to be a part of the module installation of node-sass. The stranded process has a working directory of ./node_modules/node-sass, and its command-line scripts/install.js resolves to a file within the module. Furthermore, the last line of output that reaches the console matches a line output by node-sass' scripts/install.js:

if (cachedBinary) {
    console.log('Cached binary found at', cachedBinary);
    fs.createReadStream(cachedBinary).pipe(fs.createWriteStream(binaryPath));
    return;
}

This code does not have any issues when run from the command-line (i.e., simply issuing npm install at the command prompt with a blank node_modules directory), but when npm install is launched via popen3, it appears that the stream .pipe call here blocks indefinitely.

This is a head scratcher for me at the moment...

If I ^C the terminal where Ruby has launched these child processes, the interrupt makes it to the rogue process and causes it to terminate. However, forcing all the pipes closed (or simply terminating the parent process) does not cause the rogue node.exe to exit.

I contemplated an alternative version of popen3 that explicitly waits on the child process instead of just implicitly waiting for the streams to all come to an end, but while this does permit the calling side to proceed properly, the rogue child process still hangs around, and would interfere with subsequent invocations by holding an open handle to the ./node_modules/node-sass directory.

UPDATE 4: I have opened this bug report with the node-sass project: https://github.com/sass/node-sass/issues/2459


UPDATE: I'm pretty sure this is actually a Node issue. I tracked down the root cause of the hang, and it is that through a complex tree of child processes, npm install ultimately leaves behind an instance of node.exe that just sits there, apparently indefinitely, doing nothing, keeping the stdout and stderr pipes it has inherited open.

So, this leaves new questions:

  1. Is there a way to make Node not leave behind a straggler process after an npm install completes?
  2. Is there a way to explicitly wait for the direct child process of popen3 to exit, instead of waiting for the streams to end, and then possibly close the streams from the listening side to terminate the threads pumping the output?

UPDATE 2: I have reproduced the problem with this minimalist code:

Open3::popen3 "npm install" do |stdin, stdout, stderr, thr|
  stdin.close
  stdout.each_line { |l| puts l }
end

With this code, the rogue node.exe process (command-line: scripts/install.js) hangs around after the npm install completes. Terminating the process unblocks the popen3 call (by causing stdout to come to an end, so the each_line loop terminates), and ^Cing the Ruby code (when running in an IRB window) causes the rogue node.exe to terminate (following a line in the console output: => #<IO:(closed)>).

This only happens when the process is run through popen3; the identical npm install from a CMD prompt exits normally.


Original question:

I'm having a problem with popen3 in a Ruby script. It's hanging, but I'm pretty sure it's not any of the usual candidates. I've updated my call to popen3 with tons of annotation so that I can see in the console output what's going on. Here is how I'm making the call:

  command_output_lines = []
  lock = Mutex.new

  exit_code = nil

  Logger.log("[MAIN] beginning popen3 block")

  Open3.popen3(command_w_params) do |stdin, stdout, stderr, thr|
    Logger.log("[MAIN] closing stdin stream")

    stdin.close

    Logger.log("[MAIN] starting [STDOUT]")

    stdout_thread = Thread.new do
      Logger.log("[STDOUT] started")

      begin
        stdout.each_line do |stdout_line|
          Logger.log("[STDOUT] got a line, acquiring lock")

          lock.synchronize do
            command_output_lines <<= stdout_line
            Logger.log(stdout_line)
          end

          Logger.log("[STDOUT] lock released")
        end
      rescue Exception => e
        Logger.log("[STDOUT] exception: #{e}")
      end

      Logger.log("[STDOUT] exiting")
    end

    Logger.log("[MAIN] starting [STDERR]")

    stderr_thread = Thread.new do
      Logger.log("[STDERR] started")

      begin
        stderr.each_line do |stderr_line|
          Logger.log("[STDERR] got a line, acquiring lock")

          lock.synchronize do
            command_output_lines <<= "[STDERR] " + stderr_line
            Logger.warn(stderr_line)
          end

          Logger.log("[STDERR] lock released")
        end
      rescue Exception => e
        Logger.log("[STDERR] exception: #{e}")
      end

      Logger.log("[STDERR] exiting")
    end

    Logger.log("[MAIN] joining to [STDOUT]")
    stdout_thread.join
    Logger.log("[MAIN] joining to [STDERR]")
    stderr_thread.join

    Logger.log("[MAIN] threads joined, reading exit status")

    exit_code = thr.value.exitstatus
  end

  Logger.log("[MAIN] popen3 block completed")

(Never mind what exactly Logger.log is; just know that it sends output to the console.)

Where I'm seeing the problem, command_w_params is equal to npm install, and this code is running in the context of a bundle exec rake TaskName.

When it reaches this code, I see the following console output:

[MAIN] beginning popen3 block
[MAIN] closing stdin stream
[MAIN] starting [STDOUT]
[MAIN] starting [STDERR]
[MAIN] joining to [STDOUT]
[STDOUT] started
[STDERR] started
[STDOUT] got a line, acquiring lock

[STDOUT] lock released
[STDOUT] got a line, acquiring lock
> node-sass@4.9.2 install C:\Users\Jonathan Gilbert\RepositoryName\ProjectName\node_modules\node-sass
[STDOUT] lock released
[STDOUT] got a line, acquiring lock
> node scripts/install.js
[STDOUT] lock released
[STDOUT] got a line, acquiring lock

[STDOUT] lock released
[STDOUT] got a line, acquiring lock
Cached binary found at C:\Users\Jonathan Gilbert\AppData\Roaming\npm-cache\node- sass\4.9.2\win32-x64-57_binding.node
[STDOUT] lock released

...and then it just hangs. At this point, I can see in Process Explorer that the child process has exited. There is nothing left but ruby.exe, but it just sits there indefinitely until it is explicitly cancelled. The two threads are still running, indicating that the stdout and stderr streams haven't signalled end-of-stream yet.

Now, often when people have a problem with popen3, it's because they're not reading both stdout and stderr simultaneously, and one or the other fills up its pipe buffer while the parent process is only paying attention to the other. But my code is using separate threads and keeping the pipe buffers empty.

Another problem I've seen come up is that the child process may be sticking around waiting for stdin to be closed, but in this case:

  • stdin is being closed.
  • The child process doesn't even exist any more.

Does anybody recognize these symptoms? Why are the stdout and stderr streams not hitting end-of-stream when the child process exits??

Jonathan Gilbert
  • 3,526
  • 20
  • 28

0 Answers0