Encoding issue while running Sysinternals Autorunsc via Python subprocess

Question

I have a Python 3.6.0 script that runs autorunsc v13.71 (https://technet.microsoft.com/en-us/sysinternals/bb963902.aspx) on the system, picking the x86 or x86_64 build according to the bitness reported by platform.machine(). If I run autorunsc directly from a terminal (CMD or PowerShell), I get the expected output with no issues (snippet of the output):

But, if I try to run it using my code I get this messy output:

I'm using the default Windows Notepad to open the output text file. People should be able to read it with Notepad; they won't be able to download a code editor like Notepad++, ST3, etc.

My code (removed some parts to keep it short and direct):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import platform
import socket
import subprocess
from time import gmtime, strftime
from pathlib import Path

HOSTNAME = socket.gethostname()
SYS_ARCH = platform.machine()  # AMD64 or x86
ROOT_PATH = Path(__file__).parent


def get_current_datetime():
    return strftime("%Y-%m-%d %H:%M:%S UTC%z", gmtime())


def run_command(output_file, command_name, command_args, system_cmd=False):
    output_file.write(f'---------- START [{command_name} {SYS_ARCH}] {get_current_datetime()} ----------\n')
    output_file.flush()

    file_path = command_name if system_cmd else str(ROOT_PATH / 'tools' / SYS_ARCH / (command_name + '.exe'))
    subprocess.call([file_path] + command_args, stdout=output_file, shell=True, universal_newlines=True)

    output_file.write(f'---------- ENDED [{command_name} {SYS_ARCH}] {get_current_datetime()} ----------\n\n')
    output_file.flush()
    print(f'[*] [{command_name} {SYS_ARCH}] done')


def main():
    output_file = ROOT_PATH.parent / (HOSTNAME + '.txt')
    with open(output_file, 'w', encoding='utf-8') as fout:
        run_command(output_file=fout, command_name='autorunsc', command_args=['-h', '-nobanner', '-accepteula'])


if __name__ == '__main__':
    main()

File structure:

folder\
- app.py (the code shown here)
- tools\
  - AMD64\
    - autorunsc.exe
  - x86\
    - autorunsc.exe

I believe it has something to do with the output of autorunsc; I read somewhere that it returns its output encoded as UTF-16. The thing is that I run many other Sysinternals EXEs and append their output to the same file (using my run_command function), and all of them work flawlessly except this one. How can I get this right?
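
To illustrate the suspicion above, here is a minimal sketch of what happens when UTF-16 bytes are appended to a file that otherwise holds UTF-8 text. The sample strings are made up for the demonstration and are not real autorunsc output; the little-endian byte order is an assumption.

```python
import codecs

# Hypothetical sample data, not real autorunsc output.
utf8_part = '---------- START ----------\n'.encode('utf-8')
utf16_part = codecs.BOM_UTF16_LE + 'Entry\r\n'.encode('utf-16-le')

# A child process writing UTF-16 into the UTF-8 file yields a mix like this.
mixed = utf8_part + utf16_part

# Every ASCII character in the UTF-16 part is followed by a NUL byte,
# and the part starts with the byte order mark FF FE.
print(utf16_part)

# Reading the mixed bytes back as UTF-8 cannot work cleanly: the BOM bytes
# are invalid UTF-8 and the NUL bytes survive as stray characters.
as_utf8 = mixed.decode('utf-8', errors='replace')
print(repr(as_utf8))
```

The interleaved NUL characters and the undecodable BOM are the kind of debris that would show up as a messy section in an otherwise readable file.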

score 0 · Answer 1 · answered Jun 12 '17 at 22:34

To open your file in Microsoft Notepad, you must use Microsoft new lines: \r\n (CR LF).

The open function in Python 3 has a newline parameter for that.

You can fix your code like this:

with open(output_file, 'w', encoding='utf-8', newline='\r\n') as fout:
    run_command(output_file=fout, command_name='autorunsc', command_args=['-h', '-nobanner', '-accepteula'])
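
As a small, self-contained check of what the newline parameter does, the sketch below writes two lines through a Python file object and inspects the raw bytes. The file name demo.txt is made up. Note that this translation only applies to text written through the Python file object; a child process started with stdout=output_file writes to the underlying file descriptor directly, so its bytes are not affected by newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # Each '\n' written through the file object is translated to '\r\n'.
    with open(demo_path, 'w', encoding='utf-8', newline='\r\n') as fout:
        fout.write('first\nsecond\n')

    # Read the bytes back untranslated to see what actually landed on disk.
    with open(demo_path, 'rb') as fin:
        raw = fin.read()

print(raw)
```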

Laurent LAPORTE

Didn't fix the issue. Got the same messy output. The line breaks for all the other outputs are fine. `autorunsc` isn't the only .exe I run with my code; there are other Sysinternals tools that I run and their output is correctly formatted. – JChris Jun 12 '17 at 22:42
You can use `splitlines` to split the output of your command and then write the lines into your file. – Laurent LAPORTE Jun 12 '17 at 22:47
No I can't, take a look at `subprocess.call([file_path] + command_args, stdout=output_file, stderr=output_file, shell=True, universal_newlines=True)`. It automatically sends the output to the file (`output_file`). I don't have any control over the lines. – JChris Jun 12 '17 at 22:49
Your screen copy seems to have a [BOM](https://en.m.wikipedia.org/wiki/Byte_order_mark), and the output is probably encoded in UTF-16. – Laurent LAPORTE Jun 12 '17 at 22:50
Use `check_output` instead of `call` to get the subprocess output. – Laurent LAPORTE Jun 12 '17 at 22:51
I changed it to `result = subprocess.check_output([file_path] + command_args, stderr=output_file)` and then I created a loop: `for line in result.decode('utf-16'): output_file.write(line)`, but got `UnicodeDecodeError: 'utf-16-le' codec can't decode byte 0x0a in position 10982: truncated data` as soon as I started the code. – JChris Jun 12 '17 at 23:00
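
For reference, the `truncated data` message in the last comment is what Python's UTF-16 codec reports when a lone byte is left over at the end of the input. The sketch below reproduces that message with made-up bytes; whether a stray single-byte newline is what actually ended up in the captured autorunsc output is an assumption, not something the thread confirms. It also shows that iterating over a decoded string yields characters, not lines.

```python
# Hypothetical data: valid UTF-16-LE text followed by one stray byte.
raw = 'ok\n'.encode('utf-16-le') + b'\n'  # 7 bytes, so one byte is left over

try:
    raw.decode('utf-16-le')
    error_reason = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason  # the codec's explanation of the failure

print(error_reason)

# errors='replace' substitutes a replacement character instead of raising.
lenient = raw.decode('utf-16-le', errors='replace')

# Iterating a str gives single characters; splitlines() gives lines.
chars = [c for c in 'ab\ncd']
lines = 'ab\ncd'.splitlines()
print(len(chars), lines)
```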

score 0 · Answer 2 · answered Jun 13 '17 at 00:59

I found the solution. Indeed, the issue was the encoding of the autorunsc output: it's UTF-16, while the rest of the file is UTF-8. This is what I did:

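# Special-case 'autorunsc', whose output is UTF-16; everything else is written as-is
if command_name == 'autorunsc':
    proc = subprocess.Popen([file_path] + command_args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes, so a full stderr pipe cannot block the child
    stdout_bytes, _ = proc.communicate()
    # Decode (the 'utf-16' codec honors the BOM) and normalize \r\n so the
    # text-mode file does not write \r\r\n on Windows
    output_file.write(stdout_bytes.decode('utf-16').replace('\r\n', '\n'))
else:
    subprocess.call([file_path] + command_args, stdout=output_file, stderr=output_file)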

With this code I can have all the outputs inside my single UTF-8 file.txt.
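
A possible generalization, offered as a sketch rather than a tested drop-in: instead of special-casing the tool by name, check the captured bytes for a UTF-16 byte order mark and decode accordingly. The helper names decode_tool_output and run_and_capture are hypothetical, and the UTF-8 fallback for output without a BOM is an assumption (console tools on Windows may use an OEM code page instead).

```python
import codecs
import subprocess
import sys


def decode_tool_output(raw: bytes, fallback: str = 'utf-8') -> str:
    """Decode captured output, honoring a UTF-16 BOM when one is present."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode('utf-16')  # the 'utf-16' codec consumes the BOM
    else:
        # Assumption: output without a BOM is in the fallback encoding.
        text = raw.decode(fallback, errors='replace')
    # Normalize line endings so a text-mode file does not write \r\r\n.
    return text.replace('\r\n', '\n')


def run_and_capture(argv):
    """Run a command and return its decoded stdout (stderr is discarded)."""
    done = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return decode_tool_output(done.stdout)


if __name__ == '__main__':
    # Stand-in command so the sketch runs anywhere; replace with the real tool.
    print(run_and_capture([sys.executable, '-c', 'print("hi")']))
```

The returned string can then be written with output_file.write(...), which keeps every tool's output in the file's single encoding.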
