Bash has the command substitution syntax $(f), which allows capturing the STDOUT of a command f. If the command is an executable, this is fine, since the creation of a new process is necessary anyway. But if the command is a shell function, using this syntax creates an overhead of about 25ms for each subshell on my system. This is enough to add up to noticeable delays when used in inner loops, especially in interactive contexts such as command completions or $PS1.
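The per-call cost can be estimated with a rough timing loop. This is only a sketch: the functions f_out and f_var, the variable R, and the iteration count are made up for illustration, and the absolute numbers vary widely between systems.

```bash
#!/usr/bin/env bash
# Hypothetical functions: one prints its result, one stores it in the global R.
f_out() { echo "value"; }
f_var() { R="value"; }

n=200

# Variant A: command substitution, one subshell per call.
time for ((i = 0; i < n; i++)); do
    out=$(f_out)
done

# Variant B: global variable, no subshell.
time for ((i = 0; i < n; i++)); do
    f_var
    out=$R
done
```

The time keyword writes its report to stderr; dividing the difference between the two "real" values by n gives an approximate overhead per subshell.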
A common optimization is to use global variables instead [1] for returning values, but it comes at a cost to readability: the intent becomes less clear, and output capturing becomes inconsistent between shell functions and executables. I have added a comparison of the options and their weaknesses below.
To get a consistent, reliable syntax, I was wondering whether bash has any feature that allows capturing shell-function and executable output alike, while avoiding subshells for shell functions.
Ideally, a solution would also offer a more efficient alternative to executing multiple commands in a subshell, a pattern that isolates concerns more cleanly, e.g.
person=$(
    # db_handler is set inside the subshell, which avoids leaking
    # the variable outside the scope where it is required.
    db_handler=$(database_connect)
    query "$db_handler" lastname
    echo ", "
    query "$db_handler" firstname
    database_close "$db_handler"
)
Such a construct allows the reader of the code to ignore everything inside $() if the details of how $person is formatted aren't interesting to them.
Comparison of Options
1. With command substitution
person="$(get lastname), $(get firstname)"
Slow, but readable and consistent: it doesn't matter to the reader at first glance whether get is a shell function or an executable.
2. With the same global variable for all functions
get lastname
person="$R, "
get firstname
person+="$R"
Obscures what $person is supposed to contain. Alternatively,
get lastname
local lastname="$R"
get firstname
local firstname="$R"
person="$lastname, $firstname"
but that's very verbose.
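For reference, a get function following this convention might look like the sketch below. The hard-coded values stand in for a real lookup and are purely hypothetical; only the use of the shared variable R comes from the example above.

```bash
#!/usr/bin/env bash
# Option 2: every function reports its result in the shared global R.
# The case statement is a stand-in for a real data source.
get() {
    case $1 in
        lastname)  R="Doe" ;;
        firstname) R="John" ;;
        *)         R=""; return 1 ;;
    esac
}

get lastname
person="$R, "    # R must be consumed before the next call overwrites it
get firstname
person+="$R"

echo "$person"   # prints: Doe, John
```

Because each call overwrites R, the caller has to copy the value immediately, which is where the verbosity comes from.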
3. With a different global variable for each function
get_lastname
get_firstname
person="$lastname, $firstname"
- More readable assignment, but
- If some function is invoked twice, we're back to (2).
- The side-effect of setting the variable is not obvious.
- It is easy to use the wrong variable by accident.
4. With a global variable whose name is passed as an argument
get LN lastname
get FN firstname
person="$LN, $FN"
- More readable, allows multiple return values easily.
- Still inconsistent with capturing output from executables.
Note: Assignment to dynamic variable names should be done with declare rather than eval:
$VARNAME="$LOCALVALUE" # doesn't work.
declare -g "$VARNAME=$LOCALVALUE" # will work.
eval "$VARNAME='$LOCALVALUE'" # doesn't work for *arbitrary* values.
eval "$VARNAME=$(printf %q "$LOCALVALUE")" # doesn't avoid a subshell after all.
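Putting option 4 and the declare-based assignment together, a minimal sketch could look like this. The signature get VARNAME FIELD and the hard-coded values are assumptions for illustration; declare -g needs bash 4.2 or newer.

```bash
#!/usr/bin/env bash
# Option 4: the caller passes the name of the variable to fill.
# Hypothetical signature: get VARNAME FIELD
get() {
    local value
    case $2 in
        lastname)  value="O'Neil" ;;    # the quote would break eval "$1='$value'"
        firstname) value="Mary Ann" ;;
        *)         return 1 ;;
    esac
    # Assign to the global variable named by $1 without eval or a subshell.
    declare -g "$1=$value"
}

get LN lastname
get FN firstname
person="$LN, $FN"

echo "$person"   # prints: O'Neil, Mary Ann
```

The builtin form printf -v "$1" '%s' "$value" is another way to do the assignment without a subshell; unlike declare -g it follows the normal scoping rules, so it would write to a local variable of that name if one is in scope.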