
I'm starting a process through boost::process. The process uses std::cout and std::cerr to output some information, and I need to retrieve that information. At some point, I want to store those outputs while preserving both their order and their severity (whether each line came from cout or cerr).

But I could not achieve that considering the way boost::process redirects the outputs. I could only redirect std::cout to a specific ipstream and std::cerr to another. Then, when reading them, I can't preserve the order.

Here is an MCVE isolating the problem:

#include <atomic>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

#include <boost/process.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>

void doReadOutput( boost::process::ipstream* is, boost::process::ipstream* err, std::ostream* out )
{
    std::string line;
    
    if ( std::getline( *is, line ) ) 
        *out << "cout: " << line << std::endl;
    if ( std::getline( *err, line ) ) 
        *out << "cerr: " << line << std::endl;
}

void readOutput( boost::process::ipstream* is, boost::process::ipstream* err, std::ostream* out, std::atomic_bool* continueFlag )
{
    std::string line;
    while ( *continueFlag )
    {
        doReadOutput( is, err, out );
    }

    // get last outputs that may remain in buffers
    doReadOutput( is, err, out );
}

int main( int argc, char* argv[] )
{
    if ( argc == 1 )
    {
        // run this same program with "foo" as parameter, to enter "else" statement below from a different process
        try
        {
            boost::process::ipstream is_stream, err_stream;
            std::stringstream merged_output;
            std::atomic_bool continueFlag{ true };

            boost::process::child child( argv[0],
                                         std::vector<std::string>{ "foo" },
                                         boost::process::std_out > is_stream,
                                         boost::process::std_err > err_stream );

            boost::thread thrd( boost::bind( readOutput, &is_stream, &err_stream, &merged_output, &continueFlag ) );

            child.wait();

            continueFlag = false;

            thrd.join();

            std::cout << "Program output was:" << std::endl;
            std::cout << merged_output.str();
        }
        catch ( const boost::process::process_error& err )
        {
            std::cerr << "Error: " << err.code() << std::endl;
        }
        catch (...) // @NOCOVERAGE
        {
            std::cerr << "Unknown error" << std::endl;
        }
    }
    else
    {
        // program invoked through boost::process by "if" statement above

        std::cerr << "Error1" << std::endl;
        std::cout << "Hello World1" << std::endl;
        std::cerr << "Error2" << std::endl;
        std::cerr << "Error3" << std::endl;
        std::cerr << "Error4" << std::endl;
        std::cerr << "Error5" << std::endl;
        std::cout << "Hello World2" << std::endl;
        std::cerr << "Error6" << std::endl;
        std::cout << "Hello World3" << std::endl;
    }

    return 0;
}

When I execute this program (compiled with Visual Studio 2019 under Windows 10), it outputs:

Program output was:
cout: Hello World1
cerr: Error1
cout: Hello World2
cerr: Error2
cout: Hello World3
cerr: Error3
cerr: Error4
cerr: Error5
cerr: Error6

While I want:

Program output was:
cerr: Error1
cout: Hello World1
cerr: Error2
cerr: Error3
cerr: Error4
cerr: Error5
cout: Hello World2
cerr: Error6
cout: Hello World3

Is there any way to achieve that?


Edit: as suggested by Some programmer dude, I created one thread per output stream:

#include <atomic>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

#include <boost/process.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <boost/thread/mutex.hpp>

void doReadOutput( boost::process::ipstream* str, std::ostream* out, const std::string& prefix, boost::mutex* mutex )
{
    std::string line;

    if ( std::getline( *str, line ) )
    {
        boost::mutex::scoped_lock lock( *mutex );
        *out << prefix << ": " << line << std::endl;
    }
}

void readOutput( boost::process::ipstream* str, std::ostream* out, std::string prefix, boost::mutex* mutex, std::atomic_bool* continueFlag )
{
    while ( *continueFlag )
    {
        doReadOutput( str, out, prefix, mutex );
        boost::thread::yield();
    }

    // get last outputs that may remain in buffers
    doReadOutput( str, out, prefix, mutex );
}

int main( int argc, char* argv[] )
{
    if ( argc == 1 )
    {
        // run this same program with "foo" as parameter, to enter "else" statement below from a different process
        try
        {
            boost::process::ipstream is_stream, err_stream;

            std::stringstream merged_output;
            std::atomic_bool continueFlag{ true };

            boost::process::child child( argv[0],
                                         std::vector<std::string>{ "foo" },
                                         boost::process::std_out > is_stream,
                                         boost::process::std_err > err_stream );

            boost::mutex mutex;
            boost::thread thrdis( boost::bind( readOutput, &is_stream, &merged_output, "cout", &mutex, &continueFlag ) );
            boost::thread thrderr( boost::bind( readOutput, &err_stream, &merged_output, "cerr", &mutex, &continueFlag ) );

            child.wait();

            continueFlag = false;

            thrdis.join();
            thrderr.join();

            std::cout << "Program output was:" << std::endl;
            std::cout << merged_output.str();
        }
        catch ( const boost::process::process_error& err )
        {
            std::cerr << "Error: " << err.code() << std::endl;
        }
        catch (...) // @NOCOVERAGE
        {
            std::cerr << "Unknown error" << std::endl;
        }
    }
    else
    {
        // program invoked through boost::process by "if" statement above

        std::cerr << "Error1" << std::endl;
        std::cout << "Hello World1" << std::endl;
        std::cerr << "Error2" << std::endl;
        std::cerr << "Error3" << std::endl;
        std::cerr << "Error4" << std::endl;
        std::cerr << "Error5" << std::endl;
        std::cout << "Hello World2" << std::endl;
        std::cerr << "Error6" << std::endl;
        std::cout << "Hello World3" << std::endl;
    }

    return 0;
}

Then the output is:

Program output was:
cerr: Error1
cout: Hello World1
cerr: Error2
cout: Hello World2
cerr: Error3
cout: Hello World3
cerr: Error4
cerr: Error5
cerr: Error6

Still unexpected...

jpo38
  • You have to think about the order in which you do things... Like you *always* first read from "cout" and then from "cerr". You need some synchronization between the processes, so the parent process knows what channel ("cout" or "cerr") to read from. – Some programmer dude Mar 01 '21 at 09:28
  • @Someprogrammerdude: But the process I run is a 3rd party program, I can't sync anything here. I would expect a boost::process syntax that would redirect both `boost::process::std_out` and `boost::process::std_err` to the same `ipstream` preserving the order, but I could not achieve that. – jpo38 Mar 01 '21 at 09:30
  • @Someprogrammerdude: If I read `err` before `is` the output is different, but still not what's expected. – jpo38 Mar 01 '21 at 09:37
  • @jpo38 I suggest you use asio + process. There is this useful function `async_read_until` that will allow you to react on every line asynchronously. Because what you are doing right now won't work in any way. You just read stdout and stderr and expect them to be read in order they were written to. And you are using `boost::process` not in the way it is meant to be used – bartop Mar 01 '21 at 09:47
  • @bartop: Could you please provide an example? I tried to call `async_read_until` with a `boost::process::ipstream` object but it does not work as it does not have an `async_read_some` function available... – jpo38 Mar 01 '21 at 11:01
  • A possible work-around might be to have two threads, one for each stream. – Some programmer dude Mar 01 '21 at 11:05
  • That's what I had in my code before I created the MCVE. But that does not guarantee the order to be preserved....with and without threads, it gets messed up. – jpo38 Mar 01 '21 at 11:10
  • @Someprogrammerdude: Just updated my question with the piece of code using one thread per stream and the output, which remains unexpected. – jpo38 Mar 01 '21 at 14:08
  • Two separate file handles are not synchronized in any way. The best way to simulate it that I've seen is to do the output one line at a time. That way instead of 8,000 bytes of stdout, then 8,000 bytes of stderr you get alternating output lines of stdout, stderr, repeat. – Zan Lynx Mar 02 '21 at 18:40
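
Zan Lynx's point can be made concrete without any process machinery. The sketch below is a simplified model, not Boost.Process code: `TwoPipes`, `run_child` and `alternate_read` are hypothetical names, and it assumes the child has finished writing before the parent starts reading. Each pipe is modelled as an independent FIFO, so the order within one stream survives but the interleaving between the two streams is never recorded. Reading alternately, as the first MCVE does, then reproduces the first output shown in the question.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical model: each pipe is an independent FIFO of lines.
struct TwoPipes {
    std::deque<std::string> out;  // what the child wrote to std::cout
    std::deque<std::string> err;  // what the child wrote to std::cerr
};

// The child writes in program order, but every line lands in its own FIFO.
// Nothing stores how the two sequences were interleaved.
TwoPipes run_child() {
    TwoPipes p;
    p.err.push_back("Error1");
    p.out.push_back("Hello World1");
    p.err.push_back("Error2");
    p.err.push_back("Error3");
    p.err.push_back("Error4");
    p.err.push_back("Error5");
    p.out.push_back("Hello World2");
    p.err.push_back("Error6");
    p.out.push_back("Hello World3");
    return p;
}

// Mirrors the reader of the first MCVE: one line from stdout, then one
// line from stderr, until both FIFOs are empty.
std::vector<std::string> alternate_read(TwoPipes p) {
    std::vector<std::string> merged;
    while (!p.out.empty() || !p.err.empty()) {
        if (!p.out.empty()) {
            merged.push_back("cout: " + p.out.front());
            p.out.pop_front();
        }
        if (!p.err.empty()) {
            merged.push_back("cerr: " + p.err.front());
            p.err.pop_front();
        }
    }
    return merged;
}
```

Whatever policy the reader uses, it can only pick some interleaving; the original one is not stored in either pipe.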

1 Answer


You will need non-blocking IO. The supported way in the library is by using asynchronous pipes.

You would run a loop for both stderr/stdout doing

  • async_read into a buffer until you get a full line or more
  • copy the line from the input buffer to the output buffer as soon as it becomes available

Because you'll end up with nearly the same loop over pipe/buffer state twice, it makes sense to encapsulate it into a type, e.g.

    struct IoPump {
        IoPump(io_context& io, std::string& merged) : _pipe(io), _merged(merged) {}

        boost::asio::streambuf _buf;
        bp::async_pipe         _pipe;
        std::string&           _merged;

        void do_loop();
    };

    io_context io;

    std::string merged;
    IoPump outp{io, merged}, errp{io, merged};

    bp::child child(program, std::vector<std::string> { "foo" },
        bp::std_out > outp._pipe, bp::std_err > errp._pipe);

    outp.do_loop(); // prime the pump
    errp.do_loop(); // prime the pump
    io.run();

That's all. Well, except of course, what IoPump::do_loop() actually does:

void do_loop() {
    boost::asio::async_read_until(_pipe, _buf, "\n",
        [this, out = boost::asio::dynamic_buffer(_merged)](
            error_code ec, size_t xfer) mutable {
            if (!ec) {
                out.commit(buffer_copy(
                    out.prepare(xfer), _buf.data(), xfer));
                _buf.consume(xfer);

                do_loop(); // chain
            } else {
                std::cerr << "IoPump: " << ec.message() << "\n";
            }
        });
}

Note that

  • your main application is completely single-threaded
  • meaning that async completion handlers never run concurrently
  • meaning that it is safe to just access the std::string merged; output buffer directly without worrying about synchronization

Live Demo


static void main_program(char const* program);
static void child_program();

int main(int argc, char** argv) {
    if (argc == 1)
        main_program(argv[0]);
    else
        child_program();
}

#include <iostream>
static void child_program() {
    std::cerr << "Error1"       << std::endl;
    std::cout << "Hello World1" << std::endl;
    std::cerr << "Error2"       << std::endl;
    std::cerr << "Error3"       << std::endl;
    std::cerr << "Error4"       << std::endl;
    std::cerr << "Error5"       << std::endl;
    std::cout << "Hello World2" << std::endl;
    std::cerr << "Error6"       << std::endl;
    std::cout << "Hello World3" << std::endl;
}

#include <boost/process.hpp>
#include <boost/asio.hpp>

static void main_program(char const* program) {
    namespace bp = boost::process;
    try {
        using boost::system::error_code;
        using boost::asio::io_context;

        struct IoPump {
            IoPump(io_context& io, std::string& merged) : _pipe(io), _merged(merged) {}

            boost::asio::streambuf _buf;
            bp::async_pipe         _pipe;
            std::string&           _merged;

            void do_loop() {
                boost::asio::async_read_until(_pipe, _buf, "\n",
                    [this, out = boost::asio::dynamic_buffer(_merged)](
                        error_code ec, size_t xfer) mutable {
                        if (!ec) {
                            out.commit(buffer_copy(
                                out.prepare(xfer), _buf.data(), xfer));
                            _buf.consume(xfer);

                            do_loop(); // chain
                        } else {
                            std::cerr << "IoPump: " << ec.message() << "\n";
                        }
                    });
            }
        };

        io_context io;

        std::string merged;
        IoPump outp{io, merged}, errp{io, merged};

        bp::child child(program, std::vector<std::string> { "foo" },
            bp::std_out > outp._pipe, bp::std_err > errp._pipe);

        outp.do_loop(); // prime the pump
        errp.do_loop(); // prime the pump
        io.run();

        std::cout << "Program output was:" << std::endl;
        std::cout << merged;
    } catch (const bp::process_error& err) {
        std::cerr << "Error: " << err.code().message() << std::endl;
    } catch (...) { // @NOCOVERAGE
        std::cerr << "Unknown error" << std::endl;
    }
}

Prints

IoPump: End of file
IoPump: End of file

And standard output:

Program output was:
Error1
Error2
Hello World1
Error3
Hello World2
Error4
Hello World3
Error5
Error6

Other Examples

I've got many examples on this site already. Just look for async_pipe.

Thinking Out Of The Box

You could simply redirect stderr into stdout at the descriptor level and be done! E.g.


    boost::asio::io_context io;
    std::future<std::string> merged;

    bp::child child(program, std::vector<std::string> { "foo" },
        bp::std_out > merged, bp::posix::fd.bind(2, 1), io);

    io.run();

    std::cout << "Program output was:" << std::quoted(merged.get()) << "\n";

Or with a line-wise reading loop:


    bp::ipstream merged;

    bp::child child(program, std::vector<std::string> { "foo" },
        bp::std_out > merged, bp::posix::fd.bind(2, 1));

    child.wait();

    std::cout << "Program output was:" << std::endl;
    for (std::string line; getline(merged, line);)
        std::cout << "merged: " << std::quoted(line) << "\n";

Printing

Program output was:
merged: "Error1"
merged: "Hello World1"
merged: "Error2"
merged: "Error3"
merged: "Error4"
merged: "Error5"
merged: "Hello World2"
merged: "Error6"
merged: "Hello World3"
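
For readers wondering what "at the descriptor level" amounts to, here is a rough POSIX-only sketch using plain `pipe`, `fork` and `dup2` rather than Boost.Process. The names `capture_merged`, `emit` and `sample_child` are hypothetical, and the code assumes a POSIX system, so it does not apply as written to the Windows setup in the question. Because both descriptors in the child refer to the same pipe, the kernel sees a single stream of writes and their order is kept, at the cost of no longer knowing which descriptor each line came from.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `child_body` in a forked process whose stdout and
// stderr are both attached to one pipe, and return everything it wrote.
std::string capture_merged(void (*child_body)()) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return {};
    }

    if (pid == 0) {                          // child process
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);         // stdout -> pipe
        dup2(STDOUT_FILENO, STDERR_FILENO);  // stderr -> same pipe as stdout
        if (fds[1] != STDOUT_FILENO) close(fds[1]);
        child_body();
        _exit(0);
    }

    close(fds[1]);  // parent must close its write end, or read() never sees EOF
    std::string merged;
    char buf[256];
    for (ssize_t n; (n = read(fds[0], buf, sizeof buf)) > 0;)
        merged.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return merged;
}

// Unbuffered write to a descriptor; bail out of the child on failure.
static void emit(int fd, const char* text) {
    if (write(fd, text, std::strlen(text)) < 0) _exit(1);
}

// Example child body that alternates between the two descriptors.
void sample_child() {
    emit(STDERR_FILENO, "Error1\n");
    emit(STDOUT_FILENO, "Hello World1\n");
    emit(STDERR_FILENO, "Error2\n");
}
```

Calling `capture_merged(sample_child)` should return the three lines in the order they were written. The sample child uses unbuffered `write` calls on purpose: a real child that buffers `std::cout` internally can still emit lines later than the matching `std::cerr` output.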
sehe
  • Thanks sehe. The code looks good, but your program still does not produce the expected output. For instance `Error2` should be printed after `Hello World1`... – jpo38 Mar 01 '21 at 15:48
  • I suspect it has more to do with Coliru than the code. Interesting, but try it for yourself. I cannot reproduce the Coliru result. [Wandbox](https://wandbox.org/permlink/IOXRytWiaRAm2MqF) and others concur – sehe Mar 01 '21 at 16:09
  • Just tested the code using Visual Studio 2019. I get a different output than Coliru, but still not what's expected. I see `Hello World1, Error1, Hello World2, Error2, Hello World3, Error3, Error4, Error5, Error6` while expecting `Error1, Hello World1, Error2, Error3, Error4, Error5, Hello World2, Error6, Hello World3`. – jpo38 Mar 01 '21 at 16:41
  • Ah. I see. The problem is - as I feared - that the `read_until` call can read multiple lines at once, but because I tried to hide complexity by "just leaving the extra data in the buffer for next iteration" we get the false ordering (because the other completion handler comes first). So, you **do** need to manually check for multiple lines in one completion – sehe Mar 01 '21 at 18:00
  • Then how would that affect the code you posted? Would adding a yield within the loop function to let the other wake up help? – jpo38 Mar 01 '21 at 18:56
  • There are no threads. How would a yield help? Also, if anything you want to **prevent** the other party from "going first" - the opposite of yield. However, I don't currently have time to test alternatives. – sehe Mar 01 '21 at 20:57
  • OK, I see, it's true it's completely single thread. However, it does not solve my issue that is specifically the output ordering.... – jpo38 Mar 02 '21 at 07:37
  • I believe you mentioned this on occasion :) However, I just realized. I may not have the ready answer, but we never - in good Stack Overflow style - stopped to question the premise. Instead of trying to merge two async streams we could have a single one to begin with: http://coliru.stacked-crooked.com/a/a665dd78b1839c80. In fact, you don't even need the pipe or the streambuffer any more. Just put it in a string or vector directly: http://coliru.stacked-crooked.com/a/f7d08deb46798baf – sehe Mar 02 '21 at 17:11
  • That's really great, it's now synchronized. However, one last thing is missing ;-). I need to know what comes from `cout` and what comes from `cerr`. See in my question, what comes from `cerr` is prefixed `cerr:` and what comes from `cout` is prefixed `cout:`. With your edit solution, everything is prefixed `merged:`. If you have an idea...that would make your answer just perfect ;-) – jpo38 Mar 02 '21 at 21:36
  • Arg. The edit crossed with my work on the answer. Back to square one. Godspeed! – sehe Mar 02 '21 at 21:56
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/229414/discussion-between-sehe-and-jpo38). – sehe Mar 02 '21 at 21:59
  • Aaarrrgh - I was so hoping you managed to get this working. I've the same sort of issue; my child is created using `(bp::stdout & bp::stderr) > pipe_` where `pipe_` is an `async_pipe`, and I use `async_read_some` to read from it. I see large blocks of `cerr` output then large blocks of `cout` output, but order is not preserved between the two. – cosimo193 May 22 '23 at 20:52
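
One point from the comments above can be sketched in isolation: sehe notes that a single read completion may deliver several lines at once, so the handler has to check for multiple lines instead of leaving the extra data for the next iteration. The helper below is a minimal standard-library sketch of that step, not the answer's code. `drain_lines` is a hypothetical name, and `pending` stands in for whatever per-pipe buffer the handler fills. It also tags each line with its source, which is the `cout:`/`cerr:` prefix the question asks for.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost): move every complete line from
// `pending` into `merged`, tagging each one with `prefix`. A trailing
// partial line stays in `pending` until the rest of it arrives.
// Returns the number of lines moved.
std::size_t drain_lines(std::string& pending, const std::string& prefix,
                        std::string& merged) {
    std::size_t count = 0;
    std::size_t start = 0;
    for (std::size_t nl;
         (nl = pending.find('\n', start)) != std::string::npos;
         start = nl + 1) {
        merged += prefix;
        merged.append(pending, start, nl - start + 1);  // keep the '\n'
        ++count;
    }
    pending.erase(0, start);  // drop what was consumed, keep the partial tail
    return count;
}
```

This only removes the delay caused by lines sitting in one pipe's buffer. As the last comments show, it cannot by itself recover the order in which the child wrote to two separate pipes.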