I'm writing a Python app that runs a command in a remote Docker container on AWS and saves the output to a file. The remote command generates binary data (a database dump).

The app works great if I start the download and don't touch anything. The problem is that if I hit Enter or scroll my mouse wheel in the terminal window while the download is running, my output file ends up with a ^M or other stray characters in it.

Sample Code:

#!/usr/bin/env python3

import npyscreen
import curses.ascii  # also binds the curses package; curses.ascii.NL is used below
import subprocess

MY_REGION=...
MY_CLUSTER=...
MY_TASK=...
MY_CONTAINER=...

class ProgressForm(npyscreen.Popup):
    def create(self):
        self.progress = self.add(
            npyscreen.TitleSliderPercent, step=1, out_of=100, name="Progress"
        )

    def activate(self):
        cmd = subprocess.Popen(
            [
                "aws",
                "--region",
                MY_REGION,
                "ecs",
                "execute-command",
                "--cluster",
                MY_CLUSTER,
                "--task",
                MY_TASK,
                "--container",
                MY_CONTAINER,
                "--command",
                "python -c 'for i in range(500_000): print(i)'",
                "--interactive",
            ],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            bufsize=0,
        )

        total_size = 3889129
        downloaded = 0
        with open("out.log", "wb") as f:
            while True:
                chunk = cmd.stdout.read(1024)

                if not chunk:
                    break

                f.write(chunk)

                downloaded += len(chunk)

                self.progress.set_value(min(downloaded/total_size*100, 100))
                self.progress.display()

        self.parentApp.switchForm(None)

class MAIN(npyscreen.FormBaseNew):
    def create(self):
        self.items = self.add(
            npyscreen.GridColTitles,
            col_titles=["Column"],
            select_whole_line=True,
        )
        self.items.add_handlers({curses.ascii.NL: self.item_chosen})

    def activate(self):
        for i in range(4):
            self.items.values = [
                ["Row Data"]
            ]

        self.edit()

    def item_chosen(self, inpt):
        self.parentApp.switchForm("progressForm")

class App(npyscreen.NPSAppManaged):
    def onStart(self):
        self.addForm("MAIN", MAIN, name="My App")
        self.addForm("progressForm", ProgressForm)

if __name__ == "__main__":
    app = App().run()

Hitting Enter or scrolling the mouse wheel during the download results in this:

...
10667

10668
10669
...

and this:

...
17451
17452
17453
^[[<65;121;31M17454
17455
17456
17457
...

Why is my subprocess' stdout being littered with junk data?

Edit: The full output can be found here

John
  • Some of what you've pointed out as "junk" are terminal control sequences. Think instructions to change output color and the like. Well-written software generally turns output coloring off when stdout isn't a TTY, but not all software is well-written. – Charles Duffy Nov 09 '22 at 16:59
  • I don't know the AWS ECS tools well, but I wouldn't be surprised if `--interactive` was causing a TTY to be presented, and/or enabling some amount of interaction intended for human (rather than programmatic) consumers. – Charles Duffy Nov 09 '22 at 17:01
  • BTW -- please provide transcripts of the "junk" as text, not screenshots. I can't copy-and-paste from a screenshot to look up what a control sequence does on common terminal types; I _could_ if you'd provided text. (See also [Why should I not upload images of code/data/errors?](https://meta.stackoverflow.com/a/285557/14122)) – Charles Duffy Nov 09 '22 at 17:02
  • @CharlesDuffy link to full output added. Unfortunately, `--interactive` is a required argument, and the only option (there's no `--non-interactive` or anything). In this case, these control sequences aren't static. They are being generated in response to me hitting buttons on my keyboard during download. What is detecting this? The `aws` binary that I'm running with subprocess? Is there a way for me to hide the signal, so it doesn't know I'm pressing any buttons? – John Nov 09 '22 at 17:11
  • A _link_ to the output is even more useless than images for most visitors. Just transcribe the output like you did the code. We don't need to see all of it, just enough to understand your problem statement. – tripleee Nov 09 '22 at 18:15
  • Try setting `stdin=subprocess.DEVNULL` for the subprocess. – tripleee Nov 09 '22 at 18:16
  • @tripleee I tried that. It sounds like the aws binary expects stdin to stay open: with `stdin=subprocess.DEVNULL` it starts running, gets to a count of about 430, and then writes: `Cannot perform start session: EOF`. Can I maybe set it to a fake empty stream or something? – John Nov 09 '22 at 19:08
  • I'm not sure why you're so surprised about hitting Enter doing the thing that hitting Enter usually does. I *am* surprised the scroll wheel is making control sequences show up in the output instead of scrolling, though. What terminal emulator are you using? – user2357112 Nov 09 '22 at 19:10
  • Unless stdin is being plumbed through to the subprocess, though, it shouldn't care about any of that. `stdin=subprocess.DEVNULL` may be your friend. – Charles Duffy Nov 09 '22 at 19:13
  • BTW, https://docs.aws.amazon.com/cli/latest/reference/ecs/execute-command.html very much does say that `--non-interactive` is a thing that exists. – Charles Duffy Nov 09 '22 at 19:15
  • Unfortunately, `interactive` is the [only supported mode](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ExecuteCommand.html): `Amazon ECS only supports initiating interactive sessions, so you must specify true for this value.`. Specifying `--non-interactive` errors out with `Interactive is the only mode supported currently.`. – John Nov 09 '22 at 21:07
  • BTW, just to be very sure -- is this actually present in your subprocess's stdout, or is it local echo _in your terminal_ emitting this content? If your local TTY is configured with local echo enabled, the subprocess might not be involved at all. (If you're not sure which of the two it is, maybe modify the Python program to use `repr()` or such to escape the content coming back from the remote session; you can then see whether your locally-entered content is similarly escaped). – Charles Duffy Nov 09 '22 at 21:20
  • ...one easy way to distinguish would be just to have a `sleep(10)` instead of any kind of subprocess interaction at all, and check whether you still see the same behavior during that time; if you do, then you know it's local echo and that the subprocess was a red herring. – Charles Duffy Nov 09 '22 at 21:22
  • @CharlesDuffy - after days of hitting a brick wall, setting stdin to `subprocess.PIPE` finally fixed the issues! Thank you! Did you delete your answer? I was going to mark it as accepted. Note that another approach that also worked was using `os.pipe()` to create a new file descriptor, and set the child process' stdin to that, but it sounds like that's just a longer way to do the same thing. – John Nov 09 '22 at 22:05
  • I deleted the answer because I wasn't confident that the subprocess was really part of the problem; but you've cleared that up, so it's undeleted now. – Charles Duffy Nov 09 '22 at 22:07

1 Answer

When you don't tell `subprocess` what to do with stdin, the child inherits it from the parent process, which lets the child see your Enter keypresses, scroll-wheel data, etc.
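For the scroll-wheel case, the `^[[<65;121;31M` fragment in the question's output has the shape of an xterm SGR mouse report (`ESC [ < button ; column ; row`, followed by `M` for press or `m` for release), where button code 65 conventionally means wheel-down. The sketch below decodes such a fragment; it assumes the terminal was in SGR mouse-reporting mode, which the question does not confirm, and the names `SGR_MOUSE`, `MouseReport` and `parse_sgr_mouse` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed encoding: xterm SGR mouse report,
# ESC [ < button ; column ; row, then "M" (press) or "m" (release).
SGR_MOUSE = re.compile(rb"\x1b\[<(\d+);(\d+);(\d+)([Mm])")


class MouseReport(NamedTuple):
    button: int    # 64 = wheel up, 65 = wheel down in this encoding
    column: int
    row: int
    pressed: bool  # True for a trailing "M", False for "m"


def parse_sgr_mouse(data: bytes) -> Optional[MouseReport]:
    """Return the first SGR mouse report found in data, or None."""
    match = SGR_MOUSE.search(data)
    if match is None:
        return None
    button, column, row = (int(group) for group in match.group(1, 2, 3))
    return MouseReport(button, column, row, match.group(4) == b"M")


# The polluted line from the question, with ^[ written as the ESC byte.
print(parse_sgr_mouse(b"\x1b[<65;121;31M17454\n"))
# -> MouseReport(button=65, column=121, row=31, pressed=True)
```

Read this way, the "junk" is the terminal describing a wheel event at column 121, row 31, and it reaches the child only because the child shares the terminal as its stdin.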

A typical noninteractive process won't do "local echo" of input back to output; but you're using `--interactive` here, so the behavior is not surprising.

Set `stdin=subprocess.DEVNULL` to explicitly route stdin from nowhere (stdin connected to `/dev/null` shows up as an immediate EOF on the first attempted read; most programs that aren't written to require input will handle this correctly).
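To see the "immediate EOF" behavior in isolation, here is a minimal sketch that uses a throwaway Python child instead of the `aws` CLI. The child and the helper name `bytes_seen_on_devnull_stdin` are stand-ins for illustration only; the child just reports how many bytes it managed to read from its stdin.

```python
import subprocess
import sys

# Stand-in child: read all of stdin, then report how many bytes arrived.
CHILD_CODE = (
    "import sys\n"
    "data = sys.stdin.buffer.read()\n"
    "sys.stdout.buffer.write(b'got %d bytes' % len(data))\n"
)


def bytes_seen_on_devnull_stdin() -> bytes:
    """Run the stand-in child with stdin routed from the null device."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.DEVNULL,  # child sees EOF on its first read
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(bytes_seen_on_devnull_stdin())
```

Nothing typed into the parent's terminal can reach a child started this way, but a program that treats EOF on stdin as "the session is over" will shut down early.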

If the program requires a stdin stream that isn't immediately closed, you might instead use `stdin=subprocess.PIPE`, and then leave `cmd.stdin` alone until it's time for the remote program to exit (at which point a `cmd.stdin.close()`, while not strictly mandatory, would not be amiss).
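A minimal sketch of that second approach, reusing the read loop from the question. The helper name `capture_with_private_stdin` is made up for illustration, and the demo runs a local Python child rather than `aws ecs execute-command`, so this shows the plumbing only and is not a tested AWS invocation.

```python
import os
import subprocess
import sys
import tempfile


def capture_with_private_stdin(argv, out_path, chunk_size=1024):
    """Copy the child's stdout to out_path while holding its stdin open.

    The child gets its own pipe as stdin, so keystrokes and mouse reports
    sent to the parent's terminal never reach it. Nothing is written to
    the pipe; it is closed only once the child's output has ended.
    """
    cmd = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,   # private, open, and silent
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        bufsize=0,
    )
    downloaded = 0
    try:
        with open(out_path, "wb") as f:
            while True:
                chunk = cmd.stdout.read(chunk_size)
                if not chunk:  # b"" means the child closed its stdout
                    break
                f.write(chunk)
                downloaded += len(chunk)
    finally:
        cmd.stdin.close()   # now the child may see EOF on stdin
        cmd.stdout.close()
        cmd.wait()
    return downloaded


if __name__ == "__main__":
    # Stand-in for the real command: a child that prints 1000 numbers.
    demo_argv = [sys.executable, "-c", "for i in range(1000): print(i)"]
    with tempfile.TemporaryDirectory() as tmp:
        size = capture_with_private_stdin(demo_argv, os.path.join(tmp, "out.log"))
        print(f"captured {size} bytes")
```

In the question's `ProgressForm.activate`, the equivalent change is adding `stdin=subprocess.PIPE` to the existing `Popen` call and closing `cmd.stdin` after the loop.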

Charles Duffy
  • Very interesting, but unfortunately, setting `stdin=subprocess.DEVNULL` errors out with `Cannot perform start session: EOF`. Strangely, this doesn't happen right away, but after a short time of it initially working. Anything else I can try to not let the child process inherit stdin from the parent process? – John Nov 09 '22 at 21:08
  • `stdin=subprocess.PIPE`, then, and just don't write anything to the pipe; call `cmd.stdin.close()` when you're ready for the program to exit. – Charles Duffy Nov 09 '22 at 21:10