9

I'm trying to make a program which takes an executable name as an argument, runs the executable and reports the inputs and outputs for that run. For example, consider a child program named "circle". The following would be the desired run of my program:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input',  '10\n'), ('output', 'Area: 314.158997\n')]

I decided to use the pexpect module for this job. It has a method called interact, which lets the user interact with the child program as shown above. It also takes two optional parameters: output_filter and input_filter. From the documentation:

The output_filter will be passed all the output from the child process. The input_filter will be passed all the keyboard input from the user.

So this is the code I wrote:

capture_io.py

import sys
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


def capture_io(argv):
    _stdios.clear()
    # pass the program and its arguments separately
    child = pexpect.spawn(argv[0], argv[1:])
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios


if __name__ == '__main__':
    stdios_of_child = capture_io(sys.argv[1:])
    print(stdios_of_child)

circle.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    float radius, area;

    printf("Enter radius of circle: ");
    scanf("%f", &radius);

    if (radius < 0) {
        fprintf(stderr, "Negative radius values are not allowed.\n");
        exit(1);
    }

    area = 3.14159 * radius * radius;
    printf("Area: %f\n", area);
    return 0;
}

Which produces the following output:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '1'), ('output', '1'), ('input', '0'), ('output', '0'), ('input', '\r'), ('output', '\r\n'), ('output', 'Area: 314.158997\r\n')]

As you can see from the output, the input is processed character by character and is also echoed back as output, which creates a mess. Is it possible to change this behaviour so that my input_filter runs only when Enter is pressed?

Or more generally, what would be the best way to achieve my goal (with or without pexpect)?

Asocia
  • Linux has related utilities `script` (check the `--log-in` and `--log-out` options) and `tee`. – VPfB Jun 17 '20 at 14:48
  • see [this question](https://stackoverflow.com/questions/10794894/how-to-intercept-transparently-stdin-out-err) – igrinis Jun 17 '20 at 15:39
  • @VPfB I will run this code on machines that I have no control over, so requiring yet another program is not good for me. I also can't find the `--log-in` and `--log-out` options on my own computer (`script from util-linux 2.31.1`). – Asocia Jun 18 '20 at 09:15
  • @igrinis I don't think it does what I want (at least that's how I felt when I read it), and it is more complex than it should be. – Asocia Jun 18 '20 at 09:26
  • @Asocia OK, I was not sure what solution suits your needs. Many prefer existing tools. You are right about `--log-in`, it was added only recently in 2.35. – VPfB Jun 18 '20 at 09:47
  • @VPfB I'm not familiar with `bash` programming, so I'm not sure how I could do it. In the end, the only requirement is *simplicity*. Having the `tee` command installed is probably not a big deal if it solves the problem in a natural way. When I type `man tee` it says `Copy standard input to each FILE, and also to standard output.` So it *looks like* it separates inputs and outputs. Is it possible to combine these two while preserving the order (i.e. which input goes after which output and vice versa)? – Asocia Jun 18 '20 at 10:59

3 Answers

1

When I started to write a helper, I realized that the main problem is this: the input should be logged line by line, so that backspace and other editing is done before the input reaches the program, but the output must be unbuffered in order to log a prompt that is not terminated by a newline.

To capture the output for logging, a pipe is needed, but when stdout is a pipe the C library automatically switches it to full (block) buffering, so the prompt is not written until the buffer fills or the program exits. It is known that a pseudoterminal solves the problem (the pexpect module is built around a pseudoterminal), but a terminal carries both the input and the output, and we want to unbuffer only the output.
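
The C library decides how to buffer stdout by checking whether it is a terminal. A minimal sketch of that check, seen from the child's side, is below. The helper names are made up for illustration, and the code assumes a POSIX system where `pty.openpty` is available. It runs a tiny Python one-liner once through a pipe and once through a pseudoterminal:

```python
import os
import pty
import subprocess
import sys

# One-liner run in the child: report whether its stdout is a terminal.
CHECK = "import sys; print(sys.stdout.isatty())"


def isatty_through_pipe():
    """Run the check with the child's stdout connected to a pipe."""
    result = subprocess.run(
        [sys.executable, "-c", CHECK],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode().strip()


def isatty_through_pty():
    """Run the check with the child's stdout connected to a pseudoterminal."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHECK],
        stdin=subprocess.DEVNULL,
        stdout=slave_fd,
    )
    os.close(slave_fd)               # the child keeps its own copy
    data = os.read(master_fd, 1024)  # blocks until the child prints
    proc.wait()
    os.close(master_fd)
    return data.decode().strip()


if __name__ == "__main__":
    print("pipe:", isatty_through_pipe())
    print("pty: ", isatty_through_pty())
```

The child reports `False` through the pipe and `True` through the pseudoterminal, which is the distinction a C program's stdio relies on when it picks its buffering mode.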

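Fortunately there is the stdbuf utility. On Linux it changes the stdio buffering of dynamically linked executables by preloading a helper library, so it is not universally usable: it has no effect on statically linked programs or on programs that set up their own buffering.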
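
As a small sketch of how a wrapper could apply this from Python: the helper below (a hypothetical name, not part of any library) prefixes a command with `stdbuf -o0` when the utility is found on `PATH` and returns the command unchanged otherwise. `-o0` is the GNU coreutils option that makes stdout unbuffered.

```python
import shutil


def with_unbuffered_stdout(argv):
    """Return argv prefixed with `stdbuf -o0` if stdbuf is installed.

    Hypothetical helper: falls back to the original command when
    stdbuf is not on PATH (for example on systems without coreutils).
    """
    stdbuf = shutil.which("stdbuf")
    if stdbuf is None:
        return list(argv)
    return [stdbuf, "-o0", *argv]


if __name__ == "__main__":
    print(with_unbuffered_stdout(["./circle"]))
```

Falling back silently is only one possible design; a stricter wrapper could raise an error instead, since without stdbuf the prompt may not be logged at the right moment.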

I have modified a Python bidirectional copy program to log the data it copies. Combined with stdbuf, it produces the desired output.

import select
import os

STDIN = 0
STDOUT = 1

BUFSIZE = 4096

def main(cmd):
    ipipe_r, ipipe_w = os.pipe()
    opipe_r, opipe_w = os.pipe()
    if os.fork():
        # parent
        os.close(ipipe_r)
        os.close(opipe_w)
        fdlist_r = [STDIN, opipe_r]
        while True:
            ready_r, _, _ = select.select(fdlist_r, [], []) 
            if STDIN in ready_r:
                # STDIN -> program
                data = os.read(STDIN, BUFSIZE)
                if data:
                    yield('in', data)   # optional: convert to str
                    os.write(ipipe_w, data)
                else:
                    # send EOF
                    fdlist_r.remove(STDIN)
                    os.close(ipipe_w)
            if opipe_r in ready_r:
                # program -> STDOUT
                data = os.read(opipe_r, BUFSIZE)
                if not data:
                    # got EOF
                    break
                yield('out', data)
                os.write(STDOUT, data)
        os.wait()
    else:
        # child
        os.close(ipipe_w)
        os.close(opipe_r)
        os.dup2(ipipe_r, STDIN)
        os.dup2(opipe_w, STDOUT)
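        # execlp(file, arg0, arg1, ...): cmd[0] is the file to run and
        # cmd[1:] is its argument vector, which itself starts with argv[0]
        os.execlp(*cmd)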
        # not reached
        os._exit(127)

if __name__ == '__main__':
    log = list(main(['stdbuf', 'stdbuf', '-o0', './circle']))
    print(log)

It prints:

[('out', b'Enter radius of circle: '), ('in', b'12\n'), ('out', b'Area: 452.388947\n')]
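
The log entries are bytes. The "optional: convert to str" step mentioned in the code comment could look roughly like the sketch below. `decode_log` is an illustrative name, and merging adjacent chunks of the same kind is an added assumption about what a caller would want, since `os.read` may split one logical message across several reads.

```python
def decode_log(log, encoding="utf8"):
    """Merge adjacent chunks of the same kind, then decode them to str."""
    merged = []
    for kind, data in log:
        if merged and merged[-1][0] == kind:
            merged[-1][1] += data      # same direction: extend the chunk
        else:
            merged.append([kind, data])
    # errors="replace" keeps the log printable even for invalid bytes
    return [(kind, data.decode(encoding, errors="replace"))
            for kind, data in merged]


if __name__ == "__main__":
    raw = [
        ("out", b"Enter radius of circle: "),
        ("in", b"12\n"),
        ("out", b"Area: 452.388947\n"),
    ]
    print(decode_log(raw))
```

Merging before decoding also avoids mangling a multibyte character that happens to be split between two consecutive reads.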
VPfB
0

Is it possible to change this behaviour so that my input_filter will run only when Enter is pressed?

Yes, you can do it by inheriting from pexpect.spawn and overriding the interact method. I will come to that soon.

As VPfB pointed out in their answer, you can't use a pipe, and I think it's worth mentioning that this issue is also addressed in pexpect's documentation.

You said that:

... input is processed character by character and also echoed back as output ...

If you examine the source code of interact, you can see this line:

tty.setraw(self.STDIN_FILENO)

This will set your terminal to raw mode:

input is available character by character, ..., and all special processing of terminal input and output characters is disabled.
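
The effect of raw mode on the relevant terminal flags can be checked on a scratch pseudoterminal. This is only a sketch: `local_flags` is an illustrative helper, and it looks at just two of the flags that `tty.setraw` changes.

```python
import os
import pty
import termios
import tty


def local_flags(fd):
    """Report whether ICANON and ECHO are currently set on fd."""
    lflag = termios.tcgetattr(fd)[3]   # index 3 holds the local flags
    return {
        "ICANON": bool(lflag & termios.ICANON),
        "ECHO": bool(lflag & termios.ECHO),
    }


if __name__ == "__main__":
    master_fd, slave_fd = pty.openpty()
    try:
        print("before setraw:", local_flags(slave_fd))
        tty.setraw(slave_fd)
        print("after setraw: ", local_flags(slave_fd))
    finally:
        os.close(slave_fd)
        os.close(master_fd)
```

With ICANON cleared, the terminal no longer assembles input into lines, so every key press is delivered to the reader immediately.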

That is why your input_filter function runs on every key press and sees backspace and other special characters. If you could comment out this line, you would see something like this when you run your program:

$ python3 test.py ./circle
Enter radius of circle: 10
10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '10\n'), ('output', '10\r\n'), ('output', 'Area: 314.158997\r\n')]

This would also let you edit the input (i.e. 12[Backspace]0 would give you the same result). But as you can see, it still echoes the input. This can be disabled by clearing a single flag on the child's terminal:

mode = tty.tcgetattr(self.child_fd)
mode[3] &= ~termios.ECHO
tty.tcsetattr(self.child_fd, termios.TCSANOW, mode)
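
The same three lines can be wrapped in a standalone helper and tried on a scratch pseudoterminal. This is a sketch: `disable_echo` and `flag_is_set` are illustrative names, and applying the change through the master side assumes Linux-like pty behaviour.

```python
import os
import pty
import termios


def disable_echo(fd):
    """Clear only the ECHO flag on the terminal behind fd."""
    attrs = termios.tcgetattr(fd)
    attrs[3] &= ~termios.ECHO          # index 3 holds the local flags
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


def flag_is_set(fd, flag):
    """Return True if the given local flag is set on fd."""
    return bool(termios.tcgetattr(fd)[3] & flag)


if __name__ == "__main__":
    master_fd, slave_fd = pty.openpty()
    try:
        disable_echo(master_fd)
        print("ECHO:  ", flag_is_set(master_fd, termios.ECHO))
        print("ICANON:", flag_is_set(master_fd, termios.ICANON))
    finally:
        os.close(slave_fd)
        os.close(master_fd)
```

Unlike raw mode, this leaves canonical (line-at-a-time) input processing untouched, so line editing still happens in the terminal.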

Running with the latest changes:

$ python3 test.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '10\n'), ('output', 'Area: 314.158997\r\n')]

Bingo! Now you can inherit from pexpect.spawn and override the interact method with these changes, or implement the same thing using Python's standard pty module:

with pty:

import os
import pty
import sys
import termios
import tty

_stdios = []

def _read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("output", data.decode("utf8")))
    return data


def _stdin_read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("input", data.decode("utf8")))
    return data


def _spawn(argv):
    pid, master_fd = pty.fork()
    if pid == pty.CHILD:
        os.execlp(argv[0], *argv)

    mode = tty.tcgetattr(master_fd)
    mode[3] &= ~termios.ECHO
    tty.tcsetattr(master_fd, termios.TCSANOW, mode)

    try:
        pty._copy(master_fd, _read, _stdin_read)
    except OSError:
        pass

    os.close(master_fd)
    return os.waitpid(pid, 0)[1]


def capture_io_and_return_code(argv):
    _stdios.clear()
    return_code = _spawn(argv)
    return _stdios, return_code >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)
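
A note on `return_code >> 8`: the shift works when the child exits normally, but the raw wait status also encodes termination by a signal. A more defensive decoding could look like the sketch below. `exit_code_from_status` is an illustrative name, and reporting a signal as a negative number is a convention borrowed from `subprocess`, not something the code above does. On Python 3.9 and later, `os.waitstatus_to_exitcode` provides similar behaviour.

```python
import os


def exit_code_from_status(status):
    """Translate a raw wait status into an exit code.

    A normal exit yields the exit code; death by signal N yields -N.
    """
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    raise ValueError("unexpected wait status: %r" % (status,))


if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(3)                    # child: exit immediately
    _, status = os.waitpid(pid, 0)
    print(exit_code_from_status(status))
```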

with pexpect:

import sys
import termios
import tty
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


class CustomSpawn(pexpect.spawn):
    def interact(self, escape_character=chr(29),
                 input_filter=None, output_filter=None):
        self.write_to_stdout(self.buffer)
        self.stdout.flush()
        self._buffer = self.buffer_type()
        mode = tty.tcgetattr(self.child_fd)
        mode[3] &= ~termios.ECHO
        tty.tcsetattr(self.child_fd, termios.TCSANOW, mode)
        if escape_character is not None and pexpect.PY3:
            escape_character = escape_character.encode('latin-1')
        self._spawn__interact_copy(escape_character, input_filter, output_filter)


def capture_io_and_return_code(argv):
    _stdios.clear()
    # pass the program and its arguments separately
    child = CustomSpawn(argv[0], argv[1:])
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios, child.status >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)

Asocia
-1

I don't think you'll be able to do that easily. However, I think this should work for you:

_stdios = []
output_buffer = b''

def read(data):
    global output_buffer
    # pexpect passes bytes to the filter, so buffer bytes as well
    output_buffer += data
    if data == b'\r':
        _stdios.append(("output", output_buffer.decode("utf8")))
        output_buffer = b''
    return data

Cargo23
  • Thanks for your suggestion. Unfortunately, I can't do this because `input_filter` is run on **every** key press as I said. So when user writes something like `1[Backspace]5` it will run for backspace too. I only want `5` in this situation. So I'm looking for a way to *change* underlying pty of the child. – Asocia Jun 11 '20 at 19:11
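
To illustrate the point made in the comment: a filter that sees raw key presses would have to replay the editing keys itself. The sketch below handles only the simplest case, a single-byte erase key. `apply_line_editing` is a hypothetical helper, and which byte the Backspace key sends (`0x7f` or `0x08`) depends on the terminal.

```python
ERASE_KEYS = {0x08, 0x7F}  # backspace and delete, as sent by common terminals


def apply_line_editing(keys):
    """Replay raw key presses and return the text that would remain."""
    line = bytearray()
    for byte in keys:
        if byte in ERASE_KEYS:
            if line:
                line.pop()             # erase the previous byte, if any
        else:
            line.append(byte)
    return bytes(line)


if __name__ == "__main__":
    print(apply_line_editing(b"1\x7f5"))
```

A real terminal also handles keys such as Ctrl-U and Ctrl-W as well as multibyte characters, which is why leaving line editing to the pty's canonical mode is more robust than re-implementing it in a filter.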