
I want to split a string into command-line arguments, exactly like shlex.split does. However, shlex doesn't seem to convert environment variables (for example $USER), and the output makes it impossible to know whether the environment variable was escaped:

>>> print(shlex.split("My name is Alice"))
['My', 'name', 'is', 'Alice']
>>> print(shlex.split("My name is '$USER'"))
['My', 'name', 'is', '$USER']
>>> print(shlex.split("My name is $USER")) # expected Alice, not $USER
['My', 'name', 'is', '$USER']

Is there a way to achieve this? (hopefully without re-implementing the whole thing)

Also, why doesn't shlex.split do this by default in the first place?

If it matters, I am using Python 3.6.8.

NeatNit

1 Answer

shlex.split() only tokenizes the string it is given; it does not perform shell expansions such as variable substitution.

If you build the string yourself, look the variable up in os.environ and interpolate it into the string before splitting, e.g.

import shlex
import os
print(shlex.split(f"My name is {os.environ['USER']}"))
# ['My', 'name', 'is', 'Alice']

If your input string is coming from a file, then you can evaluate the environment variables using os.path.expandvars():

import shlex
import os
print(shlex.split(os.path.expandvars("My name is $USER")))
# ['My', 'name', 'is', 'Alice']
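One caveat is worth seeing in code: os.path.expandvars() substitutes variables before shlex.split() ever looks at the quotes, so a single-quoted $USER is expanded as well. The snippet below is a small demonstration, assuming a POSIX system (the Windows implementation of expandvars treats single quotes differently); USER is set explicitly only to make the output reproducible.

```python
import os
import shlex

# Set only so the demo is reproducible; normally USER comes from the environment.
os.environ["USER"] = "Alice"

quoted = "My name is '$USER'"

# expandvars() knows nothing about shell quoting, so the variable is
# replaced even though it sits inside single quotes.
expanded = os.path.expandvars(quoted)
print(expanded)
# My name is 'Alice'

# shlex.split() then strips the quotes, and the "escaped" variable is gone.
print(shlex.split(expanded))
# ['My', 'name', 'is', 'Alice']
```

So this approach is fine when the input never needs a literal $NAME, but it cannot tell '$USER' and $USER apart.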

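If you need to account for escaped variables in the string, you can pass the string to echo in the shell using subprocess.run() with shell set to True, and split the echoed output.

This version handles all three cases in your question. It works regardless of how the variable is escaped, e.g. backslash-escaped or single-quoted. Note that echo joins the shell's words back into one line, so an argument that was quoted in order to contain spaces will be split apart again by shlex.split().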

import shlex
import subprocess

strings = [
    "My name is Alice",
    "My name is '$USER'",
    r"My name is \$USER",  # raw string: "\$" is not a valid Python escape sequence
    "My name is $USER",
]

for s in strings:
    # Let the shell expand the variables, then split the echoed text.
    result = subprocess.run(f'echo {s}', shell=True, stdout=subprocess.PIPE)
    print(shlex.split(result.stdout.decode('utf-8')))
# ['My', 'name', 'is', 'Alice']
# ['My', 'name', 'is', '$USER']
# ['My', 'name', 'is', '$USER']
# ['My', 'name', 'is', 'Alice']

WARNING:

Setting shell to True is dangerous. Only do this if the input string is trusted.

For example, if the string was "My name is $USER; rm file", then the file named file would be removed.
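If shell=True is not an option, the quoting rules can be applied by hand instead of delegating them to a shell. The following is only a minimal sketch under stated assumptions, not something shlex provides: split_expand is a hypothetical helper name, unknown variables are left unchanged (as os.path.expandvars() does), and an expanded value is never re-split on whitespace, which differs from a real shell. It handles single quotes, double quotes, backslash escapes, and the $NAME and ${NAME} forms only.

```python
import os
import re

# Matches $NAME or ${NAME}; same variable syntax that os.path.expandvars accepts.
_VAR = re.compile(r"\$(\w+|\{[^}]*\})")


def split_expand(s, env=None):
    """Split s into arguments, expanding variables except where escaped.

    Hypothetical helper: variables are expanded when unquoted or inside
    double quotes, and kept literal inside single quotes or after a backslash.
    """
    env = os.environ if env is None else env
    args, cur = [], []
    in_arg = False   # True once the current argument has started
    quote = None     # None, "'" or '"'
    i, n = 0, len(s)
    while i < n:
        c = s[i]
        if quote == "'":
            # Inside single quotes everything is literal until the closing quote.
            if c == "'":
                quote = None
            else:
                cur.append(c)
            i += 1
        elif c == "\\" and i + 1 < n:
            # Backslash makes the next character literal. Inside double quotes
            # the backslash itself is kept unless it precedes \ " $ or `.
            if quote == '"' and s[i + 1] not in '\\"$`':
                cur.append(c)
            cur.append(s[i + 1])
            in_arg = True
            i += 2
        elif c == "$":
            m = _VAR.match(s, i)
            if m:
                name = m.group(1).strip("{}")
                # Unknown variables are left as written (an assumption).
                cur.append(env.get(name, m.group(0)))
                i = m.end()
            else:
                cur.append(c)
                i += 1
            in_arg = True
        elif quote == '"':
            if c == '"':
                quote = None
            else:
                cur.append(c)
            i += 1
        elif c in "'\"":
            quote = c
            in_arg = True   # so that '' yields an empty argument
            i += 1
        elif c.isspace():
            if in_arg:
                args.append("".join(cur))
                cur, in_arg = [], False
            i += 1
        else:
            cur.append(c)
            in_arg = True
            i += 1
    if quote is not None:
        raise ValueError("No closing quotation")
    if in_arg:
        args.append("".join(cur))
    return args


demo_env = {"USER": "Alice"}  # explicit mapping so the output is reproducible
for text in ["My name is '$USER'", r"My name is \$USER", "My name is $USER"]:
    print(split_expand(text, demo_env))
# ['My', 'name', 'is', '$USER']
# ['My', 'name', 'is', '$USER']
# ['My', 'name', 'is', 'Alice']
```

Because nothing is executed, a string such as "My name is $USER; rm file" is only split into words here. The trade-off is that other shell features (command substitution, globbing, default values like ${VAR:-x}) are simply not supported.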

costaparas
  • The string was produced outside of Python (it will be read from a file), which is why the conversion will have to be made while parsing/splitting it. – NeatNit Jan 13 '21 at 12:00
  • This also expands vars that should be escaped (e.g. second example I gave, with `'$USER'`) – NeatNit Jan 13 '21 at 12:24
  • @neatnit check if the update above works for you – costaparas Jan 13 '21 at 13:38
  • Thanks, but I'm afraid `shell=True` isn't a good idea at all in my case. It *would* work though. – NeatNit Jan 13 '21 at 14:20
  • You may have to do some parsing in such a case. Could be simplified if you know the format of the input (i.e. if it follows a consistent structure). – costaparas Jan 14 '21 at 06:30
  • In the end I did use `os.path.expandvars` like you suggested, even though it's not quite as robust as I had hoped. It will almost definitely work in my case. Thanks for that! Still, I'm not sure if I can accept this answer as it doesn't tick all the boxes in my question. What do you think? – NeatNit Jan 14 '21 at 12:50
  • Great, glad one of my suggestions helped. – costaparas Jan 14 '21 at 13:05