
The problem I am having is that after running the program, it errors out saying that my argv contains 1 entry and needs to contain 0.

import subprocess
args = ['/challenge/embryoio_level75']
subprocess.run(args)
Reblochon Masque
  • That should do it. What's the problem? – Barmar Feb 18 '22 at 21:06
  • Pay attention to the fact that this is an absolute path. – Cubix48 Feb 18 '22 at 21:08
  • Welcome to Stack Overflow. Please read [ask] and https://meta.stackoverflow.com/questions/359146 and https://xyproblem.info and https://ericlippert.com/2014/03/05/how-to-debug-small-programs/. What happens when you run the code? How is that different from what is supposed to happen? What actually is your question? – Karl Knechtel Feb 18 '22 at 21:22
  • How is this question not fitting stackoverflow's scope? – Reblochon Masque Feb 18 '22 at 21:49
  • The problem i am having is that after running the program it errors out saying that my argv contains 1 entries and it needs it to be 0 – Albert Waweru Feb 18 '22 at 22:00

1 Answer


Update from rici's comment:

Apparently this hack is possible too:

import subprocess
subprocess.run([], executable="/path/to/executable")

By convention, Python, C/C++ and other languages reserve the first value (argv[0]) for the name of the executable that's currently running. It is normally set automatically by whatever spawns the process.

For Python, sys.argv[0] is one of:

  • -c
  • the script name
  • "" (empty string)

You can test it with:

exec -a "" python -c 'import sys;print(sys.argv[0]);input("pause")'
exec -a "" python <file>
exec -a "" python  # then print sys.argv[0] manually
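The three sys.argv[0] values listed above can also be checked from Python itself, without replacing the current shell. This is only a minimal sketch: the helper names are made up for illustration, and it assumes sys.executable points at a working interpreter.

```python
import os
import subprocess
import sys
import tempfile

# Code run by each child interpreter: print its own sys.argv[0].
PRINT_ARGV0 = "import sys; print(sys.argv[0])"


def _run(cmd, **kwargs):
    """Run a child interpreter and return its stdout without the final newline."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True, **kwargs)
    return result.stdout.rstrip("\n")


def argv0_with_dash_c():
    """python -c '<code>'"""
    return _run([sys.executable, "-c", PRINT_ARGV0])


def argv0_with_script():
    """python <file>; returns (reported argv[0], path that was passed)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "show_argv0.py")  # hypothetical file name
        with open(script, "w") as fh:
            fh.write(PRINT_ARGV0 + "\n")
        return _run([sys.executable, script]), script


def argv0_from_stdin():
    """python with no arguments, reading the program from a pipe."""
    return _run([sys.executable], input=PRINT_ARGV0 + "\n")


print(repr(argv0_with_dash_c()))
print(repr(argv0_with_script()[0]))
print(repr(argv0_from_stdin()))
```

Note that none of these children ever reports the interpreter's own path: sys.argv[0] is Python's processed view, not the raw argv[0] the process received.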

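Python's os.exec* functions, such as os.execvpe(), wrap the POSIX exec family. CPython's wrappers reject an empty args parameter, which would otherwise be passed to the child process as its argv. (subprocess does not go through os.execvpe() on POSIX; it uses its own C helper, which is why the hack at the top gets through.)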

A very simple C program that just prints argv[0] is below. Alternatively, use import sys;print(sys.argv[0]), but spawning with python -c '<code>' or python <file> may add unnecessary parameters, so I use C to avoid the noise.

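// gcc -o a.out main.c
#include <stdio.h>

int main(int argc, char* argv[]) {
    // argv[0] is NULL when the parent passes an empty argv, so guard the print
    if (argc > 0) {
        printf("%s\n", argv[0]);
    } else {
        printf("(argv is empty)\n");
    }
    return 0;
}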

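Spawn a child process from Python directly with os.execvpe(). Try each call separately, because a successful exec replaces the current process:

import os
os.execvpe("./a.out", [], {})  # raises ValueError: an empty args list is rejected
os.execvpe("./a.out", ["-"], {})  # but this works; the child sees "-" as argv[0]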
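To see the rejection without losing the current process, the call can be wrapped in a try/except. This is a small sketch rather than anything from the original answer: the function name is made up, and it relies on CPython raising ValueError before any exec actually happens.

```python
import os
import sys


def try_exec_with_empty_args(path):
    """Attempt os.execv() with an empty args list and return the error text.

    CPython validates args before calling the real exec, so the current
    process survives and the ValueError can be inspected.
    """
    try:
        os.execv(path, [])
    except ValueError as exc:
        return str(exc)
    # Not reached in practice: a successful exec would replace this process.
    return None


message = try_exec_with_empty_args(sys.executable)
print(message)
```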

The child process still receives a first argument, but you have to supply its value yourself, so you can pass anything in place of the executable's path.

Even os.execl() (as in C) and the related functions are guarded so that you cannot omit the value, unfortunately, even though the standard allows it.

The various exec* functions take a list of arguments for the new program loaded into the process. In each case, the first of these arguments is passed to the new program as its own name rather than as an argument a user may have typed on a command line. For the C programmer, this is the argv[0] passed to a program’s main(). For example, os.execv('/bin/echo', ['foo', 'bar']) will only print bar on standard output; foo will seem to be ignored. (source)
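The documentation's echo example can be reproduced without replacing the current process by letting subprocess set argv[0] separately from the binary it runs. This is a sketch under assumptions: the helper name is made up, and it assumes a standalone /bin/echo. A multi-call binary such as BusyBox dispatches on argv[0] and would behave differently.

```python
import subprocess


def echo_with_argv0(argv0, *rest):
    """Run /bin/echo with a caller-chosen argv[0] and return what it prints.

    The executable= parameter picks the program to run, while the first
    list element is only handed to it as its own name.
    """
    result = subprocess.run(
        [argv0, *rest],
        executable="/bin/echo",  # assumption: a standalone echo binary lives here
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# "foo" is consumed as the program name, so only "bar" is echoed.
print(echo_with_argv0("foo", "bar"), end="")
```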

The funniest part is that CPython's implementation uses _execvpe(), whose exec functions look "undefined" on a normal lookup through os.py. They are not hot-patched in, though: os.py pulls them in from the C-implemented posix module with a star import.

If we dig a bit deeper, we can find the CPython implementation that prevents us from hacking around this.

There's another way: use exec from unistd.h directly, either by wrapping it as a library (e.g. with Cython or ctypes) or by passing shell=True to subprocess and calling exec from a shell:

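# int execl(const char *pathname, const char *arg, ..., /*, (char *) NULL */);
import ctypes
lib = ctypes.CDLL("libc.so.6")
# ctypes needs bytes for char*; None becomes the terminating NULL, so no argv[0] is passed.
# On success this replaces the current Python process and never returns.
lib.execl(b"./a.out", None)

# Alternatively, from a separate script:
import subprocess
# exec -a is a bash feature, so run the command string under bash (assumed at /bin/bash)
subprocess.run('exec -a "" ./a.out', shell=True, executable="/bin/bash")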

It'll still create the first argv entry, but it'll at least be an empty string, unlike in CPython.

That is, however, Linux-specific. It might work on macOS, though. For Windows you'll need to check CreateProcess(), probably its lpApplicationName parameter.


Extra: From CPython 3.10 on, you can access sys.orig_argv, which holds the same values that would be passed to the C code above. Notice the difference between orig_argv and argv:

$ docker run -it -u 0:0 python:3.10 bash
root@a801e79a4264:/# exec -a "" python
Python 3.10.2 (main, Feb  8 2022, 04:44:29) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.orig_argv
[]
>>> sys.argv
['']
>>> 
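The same comparison can be made for an ordinary command line. The sketch below is illustrative only (the function and constant names are made up): it runs a child interpreter and reports both views, falling back to None where sys.orig_argv does not exist (before Python 3.10).

```python
import json
import subprocess
import sys

# Child code: report both views of the command line as JSON.
# sys.orig_argv exists only on Python 3.10+, hence the getattr fallback.
CHILD_CODE = (
    "import json, sys; "
    "print(json.dumps({'argv': sys.argv, "
    "'orig_argv': getattr(sys, 'orig_argv', None)}))"
)


def compare_argv_views():
    """Run a child interpreter and return its sys.argv and sys.orig_argv."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, "extra"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


views = compare_argv_views()
print(views["argv"])       # the interpreter's processed view
print(views["orig_argv"])  # the raw command line, or None before Python 3.10
```

Here sys.argv drops the interpreter path and the code string, while sys.orig_argv keeps the raw command line including the argv[0] the process was started with.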

As mentioned by rici, achieving something reasonable with the interpreter's binary is a bit problematic, so embedding helps here. One kind of embedding is quite simple, just use PyInstaller:

$ docker run -it -u 0:0 python:3.10 bash
root@149accac00c8:/# exec -a "" python main.py 
# import sys;print(sys.argv, sys.orig_argv)
['main.py'] ['', 'main.py']

vs

$ docker run -it -u 0:0 python:3.10 bash
root@21a9fed751b8:/# pip install pyinstaller
root@21a9fed751b8:/# pyinstaller --onefile main.py
root@21a9fed751b8:/# bash  # spawn subprocess to prevent exec closing the window
root@21a9fed751b8:/# exec -a "" dist/main 
[''] []
Peter Badida
  • You can run a subprocess with an empty argv array rather more simply using `subprocess.run([], executable="/path/to/executable")`. Your C program will report an empty list. However, a Python program won't, because `sys.argv[0]` is not `argv[0]`. (I don't know if this will work on Windows but it will certainly work on any Unix-like implementation.) – rici Feb 18 '22 at 22:03
  • @rici damn, it's the little things! :D I should have thought about that before digging through the C code. – Peter Badida Feb 18 '22 at 22:07
  • Also, I just noticed that Python 3.10 added [`sys.orig_argv`](https://docs.python.org/3/library/sys.html?highlight=argv#sys.orig_argv), which is supposedly "The list of the original command line arguments passed to the Python executable." However, that doesn't help either because the original command-line to run a Python script actually runs some Python executable, and the shebang command emulation loses the original `argv[0]`. So I think that the only way to actually see the original `argv[0]` would be to write a C program which embeds a Python interpreter. – rici Feb 18 '22 at 22:19
  • @rici I've added an example with PyInstaller to show the diff between non-embedded and embedded values. :) – Peter Badida Feb 18 '22 at 22:41