I'm writing a shell, and I'm a little confused by the POSIX shell specification. Say I have the command:

echo "`echo "a\\
b"`"

Should the shell output

ab

or

a\
b

?

In other words, are line continuations removed again after the escaping has been removed from the text in a command substitution? The POSIX specification appears to say that line-continuation removal does not happen again. However, all the shells I tested (bash, dash, and busybox's ash) run line-continuation removal again, causing the test script to output ab.

Script explanation:

The part of the script that's inside the command-substitution is un-escaped, producing:

echo "a\
b"

Now, if line-continuation removal is run again, it removes the backslash-newline pair, producing the command echo "ab" inside the command substitution. Otherwise, the backslash-newline pair stays between the a and the b.
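One way to look at the second stage in isolation is to hand the already un-escaped text to the parser directly. The sketch below is only an illustration, not part of the original test: it swaps echo for printf '%s\n' so that echo's own backslash handling stays out of the picture, and the variable names inner and result are made up for the example.

```shell
# The text as it looks after the first un-escaping step: a backslash
# directly followed by a newline, inside double quotes. The quoted
# here-document delimiter ('EOF') keeps the shell from altering it.
inner=$(cat <<'EOF'
printf '%s\n' "a\
b"
EOF
)

# Parse and run that text as a command. A shell that removes
# backslash-newline pairs inside double quotes at this point prints: ab
result=$(eval "$inner")
printf '%s\n' "$result"
```

If the backquoted command in the test script is parsed the same way after un-escaping, it should produce the same result as this eval.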

programmerjake
  • You can run bash in POSIX mode with `bash --posix` and see how that treats the command. – Ian Kenney May 08 '17 at 00:16
  • @IanKenney bash in POSIX mode produces a result identical to non-POSIX mode. – programmerjake May 08 '17 at 00:17
  • I know that's the case in the example you gave, but with the --posix flag set, bash is supposed to be more POSIX-compliant than without it. I thought it might be useful for you to see how existing shells interpret the specification. – Ian Kenney May 08 '17 at 00:24
  • @IanKenney OK. I mentioned that I had tested in several different shells. One of the shells I tested (dash) is designed specifically to implement no more and no less than the POSIX spec. – programmerjake May 08 '17 at 00:28
  • @programmerjake That's true. I've checked similar cases, and it looks to me like bash handles that code this way: if you're using backticks – the first ``\`` is used to escape ``\n`` while writing, the second – while invoking. If you write it without ``\\`` – you'll get ``a\nb``. (I didn't say that makes sense...) – Sylogista May 08 '17 at 00:36
  • [Check man bash (quoting)](https://linux.die.net/man/1/bash). It's interesting... – Sylogista May 08 '17 at 00:41
  • `$( )` is perfectly valid in POSIX sh. It's only pre-POSIX Bourne that doesn't have it -- and backslash handling is **much** simpler in `$()`. Which is to say, I'd suggest amending the title to specify "backticks", if that's the whole of where you have questions. – Charles Duffy May 08 '17 at 17:06
  • There's only _one_ line continuation (the embedded `"..."` string _by itself_ doesn't contain one, due to the `\ ` preceding the newline being escaped as `\\ `), and it stems from the shell interpreting the `\\ ` as single `\ ` _before_ parsing and executing the embedded command, due to use of `\`...\`` (rather than `$(...)`). – mklement0 May 08 '17 at 18:02

1 Answer

  • Old-style `...` command substitutions first interpret \ as an escape character in the embedded command, and only then parse and execute it.

    Within the backquoted style of command substitution, \ shall retain its literal meaning, except when followed by: $, `, or \.

    • In other words: any embedded \$, \`, and \\ sequences are treated as escape sequences whose 2nd character should be treated literally.

    • Thus, \\<newline> in your command is reduced to \<newline>, because `...` interprets the \\ as an escaped, literal \

    • This interpretation happens before the embedded command is parsed and executed.

    • The \<newline> in the resulting command is therefore interpreted as a line continuation (inside the double-quoted string), which effectively removes the newline.

    • Therefore, the double-quoted string is effectively parsed as literal ab, and that is what is passed to the inner echo call.

    • In bash, you can verify this processing by setting debugging options: set -xv

  • Modern syntax $(...) avoids such surprises by providing a truly independent quoting context.

    Because of these inconsistent behaviors, the backquoted variety of command substitution is not recommended for new applications that nest command substitutions or attempt to embed complex scripts.

    • With $(...), the escaped line continuation in the embedded double-quoted string is retained (in bash, dash, ksh and zsh):

      echo "$(echo "a\\
      b")"

      # Output
      a\
      b

    • Another reason to prefer $(...) is that it works the same in bash, dash, ksh and zsh, which is not true of `...`, whose behavior differs in ksh (see below).
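The two forms can be put side by side in one script. This is a minimal sketch, assuming a POSIX-like shell such as bash or dash; it uses printf '%s\n' instead of echo so that echo's own backslash handling (which varies between shells) cannot influence the result, and the variable names old and new are chosen just for the illustration.

```shell
# Backquoted form: \\ is reduced to \ before the embedded command is parsed,
# so the embedded double-quoted string ends up containing a line continuation.
old=`printf '%s\n' "a\\
b"`

# $(...) form: the embedded command is parsed exactly as written, so the
# double-quoted string holds a literal backslash followed by a newline.
new=$(printf '%s\n' "a\\
b")

printf 'backquotes: %s\n' "$old"
printf 'dollar-paren: %s\n' "$new"
```

In bash and dash, the first line printed ends in ab, while the second value keeps the backslash and spans two lines.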


Compliance in major POSIX-like shells - bash, dash, ksh, zsh

  • In ksh (verified with version 93u+), your command breaks, because ksh requires embedded " characters inside `...` to be escaped as \", which is a deviation from the standard.
    The $(...) syntax does not have this requirement.

  • bash, dash, and zsh process your `...`-based command as required by the spec (in the case of bash, whether or not it is run in POSIX-compatibility mode).

    • Note that these shells also support double quotes escaped as \" inside `...`, as ksh requires.
    • Arguably, supporting this is a deviation from the standard, given that " is not among the characters that form an escape sequence when preceded by \ in the context of `...`; e.g., echo "`echo \"a b\"`" should result in "a b", not a b.
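The \" behavior described above can be checked with a short script. This is a sketch under the assumption that it runs in bash, dash, or zsh; printf '%s\n' stands in for echo, and the variable names nested and modern are illustrative only.

```shell
# Backquoted substitution nested in double quotes, inner quotes written as \".
# A shell that treats \" as a plain " here passes the single argument: a b
# A shell that kept the backslash-quote pairs literal would instead pass
# two arguments ("a and b") and print them on separate lines.
nested="`printf '%s\n' \"a b\"`"
printf '%s\n' "$nested"

# The same embedded command with $(...): the inner quotes need no escaping.
modern="$(printf '%s\n' "a b")"
printf '%s\n' "$modern"
```

Comparing the two printed values shows whether a given shell accepts the \" form inside backquotes.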

Optional reading: cross-shell testing

If you find yourself needing to compare the behavior of POSIX-like shells frequently, consider using shall, my CLI and REPL for invoking shell scripts or commands with multiple POSIX-like shells.

By default, it targets bash, dash, ksh, and zsh (whichever ones are installed).

If you put your command in script ./tst, for instance, you would invoke shall as follows:

shall ./tst

which yields something like:

[Image: sample output of shall]

Note how invocation with ksh failed, because ksh requires " inside a `...` command substitution to be escaped as \".
Again, using $(...) would bypass this problem.
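For a quick comparison without installing anything, a plain loop over the installed shells gives a rough approximation. This is a hand-rolled sketch, not how shall itself is implemented; the function name run_in_shells is hypothetical, and the default shell list simply mirrors the four shells discussed here.

```shell
# run_in_shells SCRIPT [SHELL...]
# Runs SCRIPT with each named shell that is installed (default: bash dash
# ksh zsh) and prints a header line before each shell's output.
run_in_shells() {
  script=$1
  shift
  [ "$#" -gt 0 ] || set -- bash dash ksh zsh
  for sh_name in "$@"; do
    if command -v "$sh_name" >/dev/null 2>&1; then
      printf '== %s ==\n' "$sh_name"
      # A failure in one shell does not stop the loop.
      "$sh_name" "$script"
    else
      printf '== %s == (not installed, skipped)\n' "$sh_name"
    fi
  done
}
```

With the command saved in ./tst, calling run_in_shells ./tst runs it in each available shell in turn.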

Installation of shall from the npm registry (Linux and macOS)

Note: Even if you don't use Node.js, npm, its package manager, works across platforms and is easy to install; try
curl -L https://git.io/n-install | bash

With Node.js installed, install as follows:

[sudo] npm install shall -g

Note:

  • Whether you need sudo depends on how you installed Node.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
  • The -g ensures global installation and is needed to put shall in your system's $PATH.

Manual installation (any Unix platform with bash)

  • Download this bash script as shall.
  • Make it executable with chmod +x shall.
  • Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (macOS) or /usr/bin (Linux).
mklement0
  • I was missing that the unescaped content goes through all the steps of parsing again; the first step after splitting into lines is removing line continuations. – programmerjake May 08 '17 at 19:25
  • @programmerjake: Got it - I hadn't even questioned that aspect; that the embedded command is parsed again is clearly how it works in practice, but it's not obvious to me from the spec. – mklement0 May 08 '17 at 19:38