
Is there a minimally POSIX.2 compliant shell (let's call it mpcsh) in the following sense:

if mpcsh myscript.sh behaves correctly on my (compliant) system then xsh myscript.sh will behave identically for any POSIX.2 compliant shell xsh on any compliant system. ("Identically" up to less relevant things like the wording of error messages etc.)

Does dash qualify?

If not, is there any way to verify compliance of myscript.sh?


Edit (9 years later):

The accepted answer still stands, but have a look at this blog post and the checkbashisms command (source). Avoiding bashisms is not the same as writing a POSIX.2 compliant shell script, but it comes close.

Jens
Hans Lub
  • The Debian Almquist shell (`dash`) is fairly close. – Dietrich Epp Jul 07 '12 at 17:03
  • OK, `dash` is close: compliant, and minimalistic. But is it _minimal_, i.e. can I trust that scripts that work with `dash` will work everywhere? If not, it's a case of "close, but no [cigar](http://en.wiktionary.org/wiki/close,_but_no_cigar)" – Hans Lub Jul 07 '12 at 17:28
  • The problem is you'd need an entire set of minimal utilities: minimal `cat`, minimal `grep`, minimal `dd`... unless you're writing a pure shell script, your portability is going to depend on what features you use in the entire environment. – Dietrich Epp Jul 07 '12 at 17:32
  • You are right, but there are many use cases (like `configure` scripts) where the use of `cat`, `mv`, `cp` and their ilk is simple and standard and where one still needs to write fairly complicated variable substitutions, `while` loops etc. – Hans Lub Jul 07 '12 at 17:46
  • To an extent, writing `configure` scripts is an entirely different beast since you need to work around known deficiencies in substandard shells (see http://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Portable-Shell.html#Portable-Shell). – Dietrich Epp Jul 07 '12 at 17:49
  • Life's too short for that: I can spend an hour working around crappix99's 666-byte limitation on the length of [here documents](http://en.wikipedia.org/wiki/Here_document), or just point out that my `configure` script is POSIX.2 compliant, now please go find a better shell... – Hans Lub Jul 07 '12 at 18:07
  • The standard itself is ambiguous. How should the shell respond to `cmd &> file`? `dash` behaves differently than `bash`, and the standard does not seem to specify the correct behavior. My personal opinion is that `dash's` behavior is conformant while `bash` is incorrect. Use `dash`. If problems arise, deal with them. Unfortunately, that's the way it is. – William Pursell Aug 09 '12 at 15:08

5 Answers


The sad answer in advance

It won't help you (not as much and reliably as you would expect and want it to anyway).


Here is why.

One big problem that a virtual "POSIX shell" cannot address is everything that is ambiguously worded or simply not covered by the standard, so that shells may implement things in different ways while still adhering to the standard.

Take these two examples regarding pipelines, the first of which is well known:

Example 1 - scoping

$ ksh -c 'printf "foo" | read s; echo "[${s}]"'
[foo]

$ bash -c 'printf "foo" | read s; echo "[${s}]"'
[]

ksh executes the last command of a pipe in the current shell, whereas bash executes all - including the last command - in a subshell. bash 4 introduced the lastpipe option which makes it behave like ksh:

$ bash -c 'shopt -s lastpipe; printf "foo" | read s; echo "[${s}]"'
[foo]

All of this is (debatably) according to the standard:

Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment.

I am not 100% certain what they meant by "extension", but based on other examples in the document it does not mean that the shell has to provide a way to switch between the behaviors, only that it may, if it so wishes, implement things in this "extended way". Other people read this differently and argue that the ksh behavior is not standards-compliant, and I can see why. Not only is the wording unfortunate, it is not a good idea to allow this in the first place.

In practice it hardly matters which behavior is correct. These are the "two big shells", and people assume that code which avoids their extensions and sticks to supposedly POSIX-compliant constructs will work in either. The truth is that if you rely on one or the other of the behaviors shown above, your script can break in horrible ways.
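One way to sidestep the difference is to avoid piping into read altogether. The following is only a minimal sketch, assuming the producer's output is small enough to hold in a variable; the variable names are illustrative.

```sh
# Command substitution runs the producer in a subshell but performs the
# assignment in the current shell, so the value survives in any POSIX shell.
s=$(printf 'foo')
echo "[${s}]"

# When read's field splitting is wanted, feed read from a here-document
# instead of a pipe; read then runs in the current shell environment.
read -r first rest <<EOF
$(printf 'foo bar baz')
EOF
echo "[${first}] [${rest}]"
```

Neither form depends on whether the last pipeline stage runs in the current shell, so the result should be the same under ksh, bash and dash.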

Example 2 - redirection

This one I learnt about just a couple of days ago, see my answer here:

foo | bar 2>./qux | quux

Common sense and POLA tell me that when the next line of code is hit, both quux and bar should have finished running, meaning that the file ./qux is fully populated. Right? No.

POSIX states that

If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.

May (!) wait for all commands to complete! WTH!

bash waits:

The shell waits for all commands in the pipeline to terminate before returning a value.

but ksh doesn't:

Each command, except possibly the last, is run as a separate process; the shell waits for the last command to terminate.

So if you use a redirection in the middle of a pipeline, make sure you know what you are doing: shells treat this differently, and depending on your code it can break horribly in edge cases.
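If a script really needs every stage to have finished, one known workaround is to let all stages inherit the write end of an extra pipe and to block until that pipe reports end-of-file. The sketch below is an illustration under assumptions, not a recommendation: run_and_wait_all is a hypothetical helper, the three stages are stand-ins for foo, bar and quux, and file descriptors 3 and 4 are assumed to be free.

```sh
# Hypothetical helper: run a three-stage pipeline whose middle stage logs
# to the file named by $1, and return only after all stages have exited.
run_and_wait_all() {
    log=$1
    {
        {
            printf 'data\n' |
            # Stand-in for "bar": forwards its input, closes stdout early,
            # then keeps writing to its redirected stderr for a while.
            { cat; exec >&-; sleep 1; echo 'late message' >&2; } 2>"$log" |
            # Stand-in for "quux": may exit before the middle stage does.
            cat
        } 3>&1 1>&4 |
        # The watcher sees end-of-file only once every stage has exited,
        # because each stage inherited fd 3 (the write end of this pipe).
        cat >/dev/null
    } 4>&1
}
```

The price is that the pipeline's exit status becomes that of the watching cat, and the trick fails if a stage closes fd 3 itself.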

I could give another example not related to pipelines, but I hope these two suffice.

Conclusion

Having a standard is good, continuously revising it is even better, and adhering to it is great. But if the standard fails due to ambiguity or permissiveness, things can still break unexpectedly, which in practice renders the standard far less useful.

What this means in practice is that on top of writing "POSIX-compliant" code you still need to think and know what you are doing to prevent certain things from happening.

All that being said, one shell which has not yet been mentioned is posh, which is supposedly POSIX plus even fewer extensions than dash has (primarily echo -n and the local keyword), according to its manpage:

BUGS
   Any bugs in posh should be reported via the Debian BTS.
   Legitimate bugs are inconsistencies between manpage and behavior,
   and inconsistencies between behavior and Debian policy
   (currently SUSv3 compliance with the following exceptions:
   echo -n, binary -a and -o to test, local scoping).

YMMV.
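For scripts that should not depend even on those three exceptions, each has a commonly used portable substitute. This is a rough sketch; count_words and the variable names are hypothetical and only serve as illustration.

```sh
# Instead of "echo -n": printf never appends a newline by itself.
printf '%s' 'no trailing newline'
printf '\n'

# Instead of the binary -a / -o operators of test: combine separate
# test invocations with the shell's own && and ||.
a=left
b=right
if [ -n "$a" ] && [ "$a" != "$b" ]; then
    echo 'a is set and differs from b'
fi

# Instead of "local": give the function a subshell body, so that its
# variable assignments cannot leak into the caller.
n=outer
count_words() (
    n=0
    for word in $1; do   # unquoted on purpose: split on whitespace
        n=$((n + 1))
    done
    printf '%s\n' "$n"
)
count_words 'a b c'
printf '%s\n' "$n"       # still "outer"
```

The subshell body costs a fork per call and cannot hand results back through variables, so it only replaces local where output on stdout is enough.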

Adrian Frühwirth
  • My hypothetical minimal shell would necessarily omit all ambiguously worded features (no pipelines, sorry!) so ambiguity _per se_ is not a reason such a shell could not exist. But such a shell would be practically useless, so I accept your sad answer with tears in my eyes :-) – Hans Lub May 05 '13 at 20:18
  • I share your tears, I also wish there were such a thing :) One could strip down dash or posh to bare POSIX and add options to switch between different/buggy implementations, still better than nothing I suppose. I don't know how POSIX compliant busybox's ash is, but if that is an option we would get its other tools for free (sed, awk, grep, ...) and could also strip down those. Maybe it's time to start such a project? – Adrian Frühwirth May 05 '13 at 21:27

Probably the closest thing to a canonical shell is ash which is maintained by The NetBSD Foundation, among other organizations.

A downstream variant of this shell called dash is better known.

DigitalRoss

Currently, there is no single role model for the POSIX shell.

Since the original Bourne shell, the POSIX shell has adopted a number of additional features.

All of the shells that I know that implement those features also have extensions that go beyond the feature set of the POSIX shell.

For instance, POSIX allows for arithmetic expressions in the format:

var=$(( expression ))

but it does not allow the equivalent:

(( var = expression ))

supported by bash and ksh93.

I know that bash has a set -o posix option, but that does not disable extensions such as this one:

$ set -o posix
$ (( a = 1 + 1 ))
$ echo $a
2

To the best of my knowledge, ksh93 tries to conform to POSIX out of the box, but still allows extensions.
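Anything written with the (( ... )) command can be expressed with the arithmetic expansion that POSIX does specify. A small sketch with illustrative variable names:

```sh
# Counter loop using only $(( ... )) and test; no (( ... )) command.
i=0
sum=0
while [ "$i" -lt 5 ]; do
    i=$((i + 1))
    sum=$((sum + i))
done
echo "$sum"    # prints 15

# Arithmetic condition: expand first, then compare with test.
if [ "$((sum % 2))" -eq 1 ]; then
    echo odd
fi
```

This form runs unchanged in dash and posh as well as in bash and ksh93.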

Henk Langeveld

The POSIX developers spent years (not an exaggeration) wrestling with the question: "What does it mean for an application program to conform to the standard?" While the POSIX developers were able to define a conformance test suite for an implementation of the standards (POSIX.1 and POSIX.2), and could define the notion of a "strictly conforming application" as one which used no interface beyond the mandatory elements of the standard, they were unable to define a testing regime that would confirm that a particular application program was "strictly conforming" to POSIX.1, or that a shell script was "strictly conforming" to POSIX.2.

The original question seeks just that: a conformance test that verifies a script uses only elements of the standard which are fully specified. Alas, the standard is full of "weasel words" that loosen definitions of behavior, making such a test effectively impossible for a script of any significant level of usefulness. (This is true even setting aside the fact that shell scripts can generate and execute shell scripts, thus rendering the question of "strictly conforming" equivalent to the Halting Problem.)

(Full disclosure: I was a working member and committee leader within IEEE-CS TCOS, the creators of the POSIX family of standards, from 1988-1999.)

jdzions

If not, is there any way to verify compliance of myscript.sh?

This is basically a case of Quality Assurance. Start with:

  • code review
  • unit tests (yes, I've done this)
  • functional tests
  • perform the test suite with as many different shell programs as you can find. (ash, bash, dash, ksh93, mksh, zsh)
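The last item can be automated with a small driver. This is a minimal sketch, assuming the test suite is a single script that signals failure through its exit status; run_under_shells is a hypothetical name and the shell list is simply the one given above.

```sh
# Hypothetical helper: run the script named by $1 under every shell from
# the list that is installed, and report one result line per shell.
run_under_shells() {
    script=$1
    for shell in ash bash dash ksh93 mksh zsh; do
        if ! command -v "$shell" >/dev/null 2>&1; then
            echo "$shell: not installed"
        elif "$shell" "$script" >/dev/null 2>&1; then
            echo "$shell: ok"
        else
            echo "$shell: FAILED"
        fi
    done
}
```

Invoked as run_under_shells ./tests.sh (the file name is a placeholder), it prints one line per shell, so a regression in a single interpreter stands out immediately.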

Personally, I aim for the common set of extensions as supported by bash and ksh93. They're the oldest and most widely available interpreters of the shell language available.

EDIT: Recently I happened upon rylnd/shpec, a testing framework for your shell code. You can describe features of your code in test cases, and specify how they can be verified. Disclosure: I helped make it work across bash, ksh, and dash.

Henk Langeveld