
I'm writing a POSIX-compliant script for dash, so I'm having to get creative with fake arrays.

Contents of fake_array.sh

fake_array_job() {
array="$1"
job_name="$2"

comma_count="$(echo "$array" | grep -o -F ',' | wc -l)"

if [ "$comma_count" -lt '1' ]; then
    echo 'You gave a fake array to fake_array_job that does not contain at least one comma. Exiting...'
    exit 1
fi

array_count="$(( comma_count + 1 ))"

position=1
while [ "$position" -le "$array_count" ]; do
    item="$(echo "$array" | cut -d ',' -f "$position")"

    "$job_name" || exit

    position="$(( position + 1 ))"
done
}

Contents of script.sh

#!/bin/sh

. fake_array.sh

job_to_do() {
    echo "$item"
}
fake_array_job 'goat,pig,sheep' 'job_to_do'

second_job() {
    echo "$item"
}
fake_array_job 'apple,orange' 'second_job'

I am aware that it may seem silly to use a unique name for each job I pass to fake_array_job, but I like that I have to type it twice because it helps to reduce human error.

I keep reading that it is a bad idea to use a variable as a command. Does my use of "$job_name" to run a function have any negative implications as it concerns stability, security or efficiency?

Harold Fischer
  • Do you know the value of `IFS`, or can that potentially be changed by code outside your library? – Charles Duffy Apr 05 '18 at 17:53
  • BTW, `source` itself is a bashism, so your code already won't work with a baseline-POSIX `/bin/sh`. – Charles Duffy Apr 05 '18 at 17:53
  • `grep -o` [isn't portable either](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html). – Charles Duffy Apr 05 '18 at 17:55
  • I am pretty new to scripting. Does the value of IFS need to be set in a script or is it a global constant? – Harold Fischer Apr 05 '18 at 17:55
  • There's a default (which splits only on newlines, tabs and spaces). – Charles Duffy Apr 05 '18 at 17:55
  • Anyhow -- since you're *quoting* `"$job_name"`, your code is much less unsafe than it could be (if it were being expanded unquoted); the value of IFS doesn't actually make a difference with the quotes in place, since your string is parsed as only containing the name of a single command to run with no arguments. If that's what you want, you're on pretty solid ground. – Charles Duffy Apr 05 '18 at 17:58
  • `fake_array_job () { cmd=$1; shift; for arg; do "$cmd" "$arg"; done; }` with `job_to_do () { echo "$1"; }` and `second_job () { echo "$1"; }` seems far simpler; use it as `fake_array_job job_to_do goat pig sheep`. There's nothing wrong with storing the *name* of a command in a variable, just not a command and its arbitrary arguments. – chepner Apr 05 '18 at 17:59
  • I echoed IFS into a file and opened it in vim; it seems to contain a space, a tab, and two newlines; actually, I forgot that echo adds a newline... – Harold Fischer Apr 05 '18 at 17:59
  • @HaroldFischer The second newline comes from `echo` itself, not the value of `IFS`. – chepner Apr 05 '18 at 17:59
  • The second newline presumably came from `echo`. Generally speaking, echo can't be trusted not to modify what you're using it to inspect; safer to `printf '%s' "$IFS" | od`, or such; see the APPLICATION USAGE and RATIONALE sections in http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html – Charles Duffy Apr 05 '18 at 17:59
  • @chepner I do take your meaning, but the purpose of this is to be able to iterate through a fake array, the echo "$item" bit was just for the purpose of demonstration – Harold Fischer Apr 05 '18 at 18:02
  • `while [ -n "$var" ]; do case $var in *,*) next=${var%%,*}; var=${var#*,}; "$cmd" "$next";; *) "$cmd" "$var"; break;; esac; done` -- much more efficient to do the string manipulation internal to the shell and not involve `cut`. – Charles Duffy Apr 05 '18 at 18:06
  • @Charles Duffy Did not realize that about the -o option for grep. Will have to find another way to count the commas. I'm kind of surprised shellcheck missed that, it's done a pretty good job of making me more POSIX-aware – Harold Fischer Apr 05 '18 at 18:09
  • shellcheck's good about telling you when you're using non-POSIX syntax for the shell itself with a #!/bin/sh shebang, but it doesn't check non-shell utilities' usage for extensions (since someone could intentionally write a script for /bin/sh but have it depend on GNU-only tools). – Charles Duffy Apr 05 '18 at 18:22
  • @Charles Duffy Any chance you know of a tool that checks for GNUisms? – Harold Fischer Apr 05 '18 at 18:25
  • @HaroldFischer Not a code-checking tool, but http://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html provides links to the specification for each standard tool. If you're using a tool, stick to what's documented there. – chepner Apr 05 '18 at 18:30
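
To illustrate the quoting point raised in the comments, here is a minimal sketch of the difference between expanding the variable quoted and unquoted. The function `greet` and the value `greet world` are hypothetical and exist only for this demonstration.

```sh
#!/bin/sh
# Hypothetical job function, used only for this demonstration.
greet() {
    echo "hello ${1:-nobody}"
}

job_name='greet world'

# Unquoted: the value is field-split on IFS, so this runs
# `greet` with the single argument `world`.
$job_name

# Quoted: the whole string is treated as one command name.
# Nothing is called "greet world", so the lookup fails.
"$job_name" 2>/dev/null || echo "quoted form failed with status $?"
```

With a value that holds only a function name, as in the question, the quoted form simply runs that function with no arguments.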

1 Answer


(Read to the end for a good suggestion by Charles Duffy. I'm too lazy to completely rewrite my answer to mention it earlier...)


You can iterate over the "array" using simple parameter expansions without requiring multiple elements in the array.

fake_array_job() {
    args=${1%,},   # Ensure the array ends with a comma
    job_name=$2

    while [ -n "$args" ]; do
        item=${args%%,*}
        "$job_name" || exit
        args=${args#*,}
    done 
}
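
As a usage sketch for this version, the calls below show how it treats a trailing comma. `show_item` is a hypothetical job added for illustration; it wraps each item in brackets so that empty items are visible.

```sh
#!/bin/sh
# Same approach as above: normalise the input to end in exactly one extra comma.
fake_array_job() {
    args=${1%,},
    job_name=$2

    while [ -n "$args" ]; do
        item=${args%%,*}
        "$job_name" || exit
        args=${args#*,}
    done
}

# Hypothetical job that makes empty items visible.
show_item() {
    printf '[%s]\n' "$item"
}

fake_array_job 'foo,bar'   show_item   # two items: foo, bar
fake_array_job 'foo,bar,'  show_item   # still two items: the trailing comma is absorbed
fake_array_job 'foo,bar,,' show_item   # three items: foo, bar, and an empty one
```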

One problem with the above is that it ensures the array is comma-terminated by assuming that foo,bar, is not a comma-delimited array with an empty last element. A better (though uglier) solution is to use read to break up the array.

fake_array_job () {
  args=$1
  job_name=$2
  rest=$args
  while [ -n "$rest" ]; do
    IFS=, read -r item rest <<EOF
$rest
EOF
    "$job_name" || exit
  done  
}

(You can use <<-EOF and make sure the here doc is indented with tabs, but it's hard to convey that here, so I'll just leave the ugly version.)

There's also Charles Duffy's good suggestion of using case to pattern match on the array to see if there are any commas left or not:

while [ -n "$args" ]; do
  case $args in
    *,*) next=${args%%,*}; args=${args#*,}; "$cmd" "$next";;
      *) "$cmd" "$args"; break;;
  esac
done
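
Wrapped into a complete function, and passing the item as an argument as suggested in the comments, the `case` approach might look like the sketch below. The variable names `rest` and `next` are illustrative, and using `return` instead of `exit` on failure is my own assumption, so that the caller decides whether to stop the script.

```sh
#!/bin/sh
# Sketch: split a comma-separated string and call a named function once per item.
# The item is passed as "$1" rather than through a global variable.
fake_array_job() {
    rest=$1
    cmd=$2

    while [ -n "$rest" ]; do
        case $rest in
            *,*)
                # At least one comma left: peel off the first item.
                next=${rest%%,*}
                rest=${rest#*,}
                "$cmd" "$next" || return
                ;;
            *)
                # No comma left: this is the last item.
                "$cmd" "$rest" || return
                break
                ;;
        esac
    done
}

job_to_do() {
    echo "$1"
}

fake_array_job 'kitten,meow,purr' job_to_do
```

A single-element string such as `solo` also works here, because the second `case` branch handles input without any comma.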
chepner
  • I missed Charles Duffy's similar comment while typing this up. – chepner Apr 05 '18 at 18:14
  • I *do* think your suggestion of making it `"$job_name" "$item"` (in the comments) was wise; that's a clearer calling convention, not relying on global variables being passed around. – Charles Duffy Apr 05 '18 at 18:16
  • Yeah, for the answer I'm just sticking to making the comma-delimited string simpler to work with. – chepner Apr 05 '18 at 18:27
  • I must be doing something stupid because I keep getting trapped in a loop when I give the function 'kitten,meow,purr' as the first argument – Harold Fischer Apr 05 '18 at 20:09
  • No, that's my fault. My code assumes that each element in the array is *terminated* by a comma, rather than *separated* by commas. This means in the end, when `$args == purr`, that `args=${args#*,}` won't match and `$args` stays `purr` forever. – chepner Apr 05 '18 at 20:14
  • Using `IFS=, read head tail` would not have the problem, but I wanted to avoid the here document it requires for aesthetic reasons. I'll add it to the answer. – chepner Apr 05 '18 at 20:17
  • The `case` approach to check for a comma used in my comment would also work around that issue. :) – Charles Duffy Apr 05 '18 at 20:20
  • When using the case $var in ... esac, is it OK to put $var in double quotes? – Harold Fischer Apr 06 '18 at 15:53
  • Yes; I tend to leave things unquoted when not strictly required. – chepner Apr 06 '18 at 16:00