16

I'm contemplating to make all bash scripts of a large codebase shellcheck compliant, but the task is overwhelming, because too many developers have historically ignored rule number one of all shell scripting: always use quotes.

It would be helpful if there was a tool that could fix at least the quoting. I would then be able to fix the rest by hand. My regex didn't cut it, because only variables not already in a string must be quoted.

Sample input:

echo "Removing $a ${b} $(c $(c)) `d $d` ${10} $@ now"
rm -rf $a ${b} $(c $(c)) `d $d` ${10} $@

Sample output:

echo "Removing $a $b $(c "$(c)") `d "$d"` ${10} $@ now"
rm -rf "$a" "$b" "$(c "$(c)")" "$(d "$d")" "${10}" "$@"

It doesn't have to fix all the above, and it doesn't even have to be flawless (though that would be really nice), but it has to be right more often than not to be useful.


Here is my naïve regex that didn't cut it:

s:([^"])\$\{([_A-Za-z0-9]+)\}([^"]|$):\1"\$\2"\3:g

It transforms ${identifier} to "$identifier", except when immediately preceded or followed by a quote, but fails to detect if we are deeper within the string.

user2394284
  • 5,520
  • 4
  • 32
  • 38
  • 3
    "My regex didn't cut it." Can you show what you tried and which cases it failed for? This is a noble endeavour btw. I would be interested in solutions. – Chem-man17 Dec 12 '16 at 15:35
  • 2
    You basically need a `bash` parser to determine which parameter expansions are quoted and which are not. Regular expressions are not sufficient for such parsing. – chepner Dec 12 '16 at 16:11
  • @chepner - Indeed, especially since `bash` is not a [regular language](https://en.wikipedia.org/wiki/Regular_language). – antiduh Dec 12 '16 at 16:38
  • 2
    Not only do you pretty much need a parser for the shell language, that parser needs to be magical. Only that way can it divine which variable references *should* be quoted, for there may be cases where it is intentional that a variable reference can expand to multiple shell words. Quoting such a variable reference would break that. – John Bollinger Dec 12 '16 at 16:51
  • 1
    *There may be cases where it is intentional* Only once in my life have I encountered that. We have code review, so don't worry ;-) – user2394284 Dec 12 '16 at 16:53
  • *You basically need a bash parser.* Shellcheck is a pretty good bash parser in my experience, and it already knows what needs to be done! – user2394284 Dec 12 '16 at 17:01
  • I don't think it is feasible to modify your entire codebase since it requires a complex parser. What I would do instead is to enable shellcheck on the scripts that are modified to take care of requirements like enhancements, bug fixes etc., and leave the rest of them as they are! – codeforester Dec 12 '16 at 18:17
  • I don't see how this could be safely automated -- specially when dealing with a lot of code from different developers. – Jamil Said Dec 12 '16 at 20:18
  • There may be cases where it is intentional: `files="a b c"; touch ${files}; rm ${files}` is different from `files="a b c"; touch "${files}"; rm "${files}"`. – Walter A Dec 12 '16 at 23:03
  • You'll never do it with simple regular expressions search&replace. Note that `"$var1 $var2"` doesn't need to be replaced. It needs to be a program, an existing one (like shellcheck) or a new very simple one. It doesn't need to be a complete bash parser, because it will be totally focused in shell variables analysis. – Wilfredo Pomier Dec 13 '16 at 00:35
  • 1
    @WalterA, those would need to be fixed as well, but with `files=(a b c); touch -- "${files[@]}"; rm -f -- "${files[@]}"`. One of the cases for a code-review as that can't really be automated. – Stephane Chazelas Aug 10 '17 at 12:30

2 Answers2

16

WPomier beat me to it, but I did my own as well (because I wanted to):
https://github.com/anordal/shellharden

It acts as a syntax highlighter, until you give it the --transform option.

user2394284
  • 5,520
  • 4
  • 32
  • 38
7

This is not an existent tool, but a little program in C, that it can help you as a base to get what you want.

You can see it here.

Example:

$ cat script.sh
echo "Removing $a ${b} $(c $(c)) `d $d` ${10} $@ now"
rm -rf $a ${b} $(c $(c)) `d $d` ${10} $@

$ checkshellvar < script.sh
echo "Removing $a ${b} $(c $(c)) `d $d` ${10} $@ now"
rm -rf "$a" "${b}" "$(c "$(c)")" "$(d "$d")" "${10}" "$@"

Disclaimer: The program achieves your sample output, but I did it in my coffee break, so don't expect too much ;-)

Note: Despite this program, I totally belieave that the quotes in shell scripting has a meaning, and their absence or the use of single or double quotes is perfectly valid depending on circumstances.

Wilfredo Pomier
  • 1,091
  • 9
  • 12
  • The first commit is a bit crashy :-( – user2394284 Dec 13 '16 at 21:47
  • I told you I did it in the kitchen :-) I added support to one argument (filename) in order to debug it. I tested against many of my own scripts and it appears run fine now. – Wilfredo Pomier Dec 14 '16 at 01:14
  • Exactly what I asked for. Not perfect, but useful with the right grain of salt. You mentioned that heredocs are not supported, so I'll treat those files manually. – user2394284 Dec 16 '16 at 10:52