
I am having some trouble with the subprocess module. I would like the module to run the shell command equivalent to 'ls -l "/path/to/file/with possible space in directory/or with space in name"'. Subprocess works fine when the filename is not a variable. If the filename is a variable that contains the quotes, then it doesn't work.

Code that doesn't work:

import subprocess

archive_file_list = "/var/tmp/list"
archive = open(archive_file_list, "r")

for line in archive:
    noreturnline = line[:-1]
    quotedline = "\"" + noreturnline + "\""
    if extension == "zip":
        print quotedline
        archivelist = subprocess.check_output(['ls', '-l', quotedline])
        print archivelist

Code that works:

archivelist = subprocess.check_output(['ls', '-l', "/path/to/file/with possible space in directory/or with space in name"])

Here is the output for the code that doesn't work:

"/path/to/file/with possible space in directory/or with space in name"
ls: cannot access "/path/to/file/with possible space in directory/or with space in name" No such file or directory
Traceback (most recent call last):
File "./archive_test.py", line 12, in <module>
archivelist = subprocess.check_output(['ls', '-l', quotedline])
File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['ls', '-l', '"/path/to/file/with possible space in directory/or with space in name"']' returned non-zero exit status 2

Before you ask - yes, I have already verified that "/path/to/file/with possible space in directory/or with space in name" does in fact exist by running 'ls -l' from the command line.

Any help would be appreciated. Thanks in advance.

user3155618
  • Don't add quotes, the `subprocess` module will escape your whitespaced strings the proper way for the underlying shell (as escape patterns might differ). – zwer Jun 07 '17 at 20:42
  • The quotes here are for the benefit of the *shell*, not for the benefit of `ls`. If you don't have a shell, you shouldn't have any literal quotes. – Charles Duffy Jun 07 '17 at 20:46
  • @CharlesDuffy Won't it read it as escaped backspaces? – cs95 Jun 07 '17 at 20:48
  • @Shiva, no, because it's the shell that's responsible for processing escapes, not `ls`. In the case here, there *is* no shell. – Charles Duffy Jun 07 '17 at 20:48

1 Answer


In the command that works (which is also the best option there is):

archivelist = subprocess.check_output(['ls', '-l', "/path/to/file/with possible space in directory/or with space in name"])

the third argument is actually /path/to/file/with possible space in directory/or with space in name (without quotes) which is the filename that exists, and the command works.

Since shell=True isn't even set, the command is directly passed to exec, with the arguments passed as-is: the spaces & other chars are preserved.

If you add more quotes, they're not removed and they're passed literally to ls.

Since there's no such file called "/path/to/file/with possible space in directory/or with space in name" (with quotes), the file/dir isn't found.
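To see this without involving ls at all, here is a minimal sketch that passes an argument to a child Python process and reads back exactly what the child received. The helper name echo_argument is made up for this illustration; the point is only that a list argument arrives byte for byte, so any quotes you add arrive too.

```python
import subprocess
import sys


def echo_argument(arg):
    """Run a child Python process that writes its first argument back verbatim."""
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    # List form, no shell: arg is handed to the child as one argument, untouched.
    out = subprocess.check_output([sys.executable, "-c", child_code, arg])
    return out.decode()


path = "/path/to/file/with possible space in directory/or with space in name"

plain = echo_argument(path)                # spaces survive, nothing is split
quoted = echo_argument('"' + path + '"')   # the added quotes survive as well

print(repr(plain))
print(repr(quoted))
```

The second call is what the code in the question does: the child (ls in your case) is asked for a name that starts and ends with a literal double quote.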

There's another (dirty) way of calling a command: passing the full command as a single string instead of a list of parameters. On Windows this works without shell=True, since the command line is handed over as a single string; on Unix-like systems a string without shell=True is treated as the name of the program to run, so shell=True is required there:

subprocess.check_output('ls -l "/path/to/file/with possible space in directory/or with space in name"')

But the list approach is cleaner, especially if you don't know the directory name in advance because it's a parameter. Let subprocess do the heavy lifting for you.
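Applied to the loop in the question, that could look like the sketch below. The function name list_zip_entries is hypothetical, and since the question never shows how extension is computed, the endswith(".zip") check is an assumption standing in for it. It also assumes an ls binary on the PATH, as in the question.

```python
import subprocess


def list_zip_entries(list_path):
    """Run `ls -l` on every .zip path named in list_path (one path per line)."""
    listings = []
    with open(list_path, "r") as archive:
        for line in archive:
            # Strip only the trailing newline; do not add any quotes.
            filename = line.rstrip("\n")
            # Assumption: stands in for the undefined `extension == "zip"` test.
            if filename.endswith(".zip"):
                # The filename is one list element, so spaces need no escaping.
                listings.append(subprocess.check_output(["ls", "-l", filename]))
    return listings
```

Calling list_zip_entries("/var/tmp/list") would then return one ls listing per zip file named in the list.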

On Unix-like systems, the string approach requires shell=True, but then you expose your program to shell injection, as with any call that hands untrusted text to a shell: a filename ending in ;rm -rf /, or one containing a sub-shell expression, would be executed.
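If you really must go through a shell on a Unix-like system, the filename has to be quoted by code, not by hand. A minimal sketch, assuming Python 3 (where the function is shlex.quote; on Python 2 the equivalent is pipes.quote). The helper names build_shell_command and ls_via_shell are made up for this example.

```python
import shlex
import subprocess


def build_shell_command(filename):
    """Return an `ls -l` command line with the filename quoted for a POSIX shell."""
    return "ls -l " + shlex.quote(filename)


def ls_via_shell(filename):
    """Run the quoted command through the shell (POSIX shells only)."""
    return subprocess.check_output(build_shell_command(filename), shell=True)


# A hostile name stays a single, inert argument once quoted.
print(build_shell_command('name"; echo pwned; : "'))
```

This is still more fragile than the list form, which needs no quoting step at all.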

Final note: if you're really planning to use ls and parse its output, don't do it (http://mywiki.wooledge.org/ParsingLs). Use the standard os.listdir, os.path.getsize/getmtime and os.stat calls to get the information you need.
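As a rough sketch of that alternative, the function below collects name, size and modification time for each entry of a directory using only os.listdir and os.stat. The name describe_directory and the tuple layout are choices made for this example, not anything prescribed.

```python
import os


def describe_directory(directory):
    """Return (name, size_in_bytes, mtime) tuples for each entry in directory."""
    entries = []
    for name in sorted(os.listdir(directory)):
        info = os.stat(os.path.join(directory, name))
        entries.append((name, info.st_size, info.st_mtime))
    return entries
```

Names with spaces need no special treatment here, because nothing is ever split into arguments or parsed back out of text.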

Jean-François Fabre
  • ...first approach is cleaner, *and* more secure (if you're handling untrusted filenames, you don't need one with `$(rm -rf ~)'$(rm -rf ~)'` in its contents to cause havoc). – Charles Duffy Jun 07 '17 at 20:50
  • @CharlesDuffy I don't know the expression you're talking about but terminating the command with `;` then chaining another command would be a problem too. but anyway, If `shell=True` isn't set that's still safe right? – Jean-François Fabre Jun 07 '17 at 20:51
  • Quite right -- a file named `/path/to/file/"; rm -rf ~; : "` would indeed cause trouble too. And also quite right. (I was expanding on your last sentence, talking about how the `shell=False` approach is more secure). – Charles Duffy Jun 07 '17 at 20:52
  • but only if `shell=True`. Otherwise it's not interpreted (testing on MS-DOS/Windows with `&&` confirms this) – Jean-François Fabre Jun 07 '17 at 20:53
  • BTW, on UNIXy systems, I can't reproduce the stated behavior that passing a string will work with multiple arguments when `shell=False` (as is default). – Charles Duffy Jun 07 '17 at 20:54
  • `subprocess.check_output("echo 'hello world'")` gives me a "No such file or directory" (as I'd expect it to), as opposed to the claim that "`subprocess` handles the argument splitting" "without `shell=True`". – Charles Duffy Jun 07 '17 at 20:56
  • wouldn't be because `echo` is built-in? (I know there's a non-built-in echo as well, but it's a good lead). `cd` works only with `shell=True` – Jean-François Fabre Jun 07 '17 at 20:56
  • Rather, that would be because `subprocess` *doesn't* actually handle argument splitting (at least on Unixlikes -- Windows is a foreign land to me), so the above is looking for a single file with a name containing quotes and spaces, akin to `/usr/bin/echo 'hello world'` or such. – Charles Duffy Jun 07 '17 at 20:57
  • @CharlesDuffy that's an interesting difference then. Because `subprocess.check_output('ls -l "/path/to/file/with possible space in directory/or with space in name"')` works fine on Windows (provided that `ls` is installed from MSYS or Cygwin). Let me edit my answer. – Jean-François Fabre Jun 07 '17 at 20:59
  • Can't say I'm surprised -- Windows actually *does* use strings for its calling convention, after all, whereas on UNIX the list-of-C-strings approach is plumbed all the way down. – Charles Duffy Jun 07 '17 at 21:00
  • @Jean Francois Fabre, thank you so much for solving this issue AND explaining it so I understand. – user3155618 Jun 07 '17 at 21:00
  • @CharlesDuffy thank you so much for solving this issue AND explaining it so I understand. – user3155618 Jun 07 '17 at 21:01
  • @CharlesDuffy I cannot let that one pass: "don't parse the output of `ls`" :) – Jean-François Fabre Jun 07 '17 at 21:04
  • @user3155618 you're welcome, but I hope you're not actually using `ls` in your program, because there are other cleaner ways to get file information from python without using system calls. – Jean-François Fabre Jun 07 '17 at 21:04
  • @Jean Francois Fabre, I am only using ls for testing purposes. The ultimate purpose is to run commands for the appropriate type of file. The files are all tar, tar.gz, tar.Z, zip files, and I want to list out the contents of each file. I tested using ls. I don't *think* python has archive listing tools, so I will likely need to do system calls. – user3155618 Jun 07 '17 at 21:10
  • @user3155618: you're wrong. Python can handle zip & tar & gzip files natively. For `.Z` you may need to run `compress` but the rest is OK. Check into the `zipfile` & `tarfile` & `gzip` modules. One Q&A at random (there are hundreds of those): https://stackoverflow.com/questions/8176953/python-zipfile-path-separators – Jean-François Fabre Jun 07 '17 at 21:11