4

Given a string representation of an arbitrary Bash "simple command", how can I split it into an array containing its individual "parts", i.e. the command name and individual parameters, just like the shell itself (i.e. Readline) would split it when parsing it and deciding which executable/function to run and which parameters to pass it?


My specific use-case is needing to parse user-defined alias definitions. E.g. an alias might be defined as:

alias c2="cut -d' ' -f2"  # just an example... arbitrary commands should be handled!

And this is how my bash script would try to parse it:

alias_name="c2"
alias_definition=$(alias -p | grep "^alias $alias_name=") # "alias c2='cut -d'\'' '\'' -f2'"
alias_command=${alias_definition##alias $alias_name=}     # "'cut -d'\'' '\'' -f2'"
alias_command=$(eval "echo $alias_command")               # "cut -d' ' -f2"

alias_parts=($alias_command) # WRONG - SPLITS AT EVERY WHITESPACE!

echo "command name: ${alias_parts[0]}"

for (( i=1; i < ${#alias_parts[@]}; i++ )); do  # ${#alias_parts[@]} is the element count
  echo "parameter $i : ${alias_parts[$i]}"
done

Output:

command name: cut
parameter 1 : -d'
parameter 2 : '
parameter 3 : -f2

Desired output:

command name: cut
argument 1  : -d' '
argument 2  : -f2


What would I need to replace the alias_parts=($alias_command) line with, to achieve this?

smls

5 Answers

3

As l0b0 said, it's not Readline. The shell itself does the splitting, so use the shell itself to do the parsing.

alias c2="cut -d' ' -f2"

split_parts() {
    alias_parts=("$@")
}

alias_defn=$(alias c2)
# 2 evals needed to get rid of quotes
eval eval split_parts ${alias_defn#alias c2=}

for (( i=0; i < ${#alias_parts[@]}; i++ )); do  # [@] gives the element count, not the length of element 0
  echo "parameter $i : \"${alias_parts[$i]}\""
done

outputs

parameter 0 : "cut"
parameter 1 : "-d "
parameter 2 : "-f2"

Note that the -d includes the trailing space that the shell actually sees.
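
The same idea can be sketched as a reusable function. This is only a sketch under a few assumptions: it needs bash 4.0 or later for the BASH_ALIASES array, the function name split_alias is made up for illustration, and the alias body must be a trusted simple command, because eval will also run any command substitution and expand any unquoted wildcard it contains. Since BASH_ALIASES holds the alias body without the extra layer of quoting that `alias` prints, a single eval is enough here.

```bash
# split_alias NAME (hypothetical helper)
# Fills the global array alias_parts with the words of alias NAME,
# split the way the shell would split them (quotes removed).
# Assumes bash >= 4.0 and a trusted alias body.
split_alias() {
    local name=$1
    local body=${BASH_ALIASES[$name]-}   # raw alias text, e.g. cut -d' ' -f2
    [[ -n $body ]] || return 1           # no such alias
    eval "alias_parts=( $body )"         # let the shell do the word splitting
}

alias c2="cut -d' ' -f2"
split_alias c2 && printf '<%s>\n' "${alias_parts[@]}"
```

With the c2 alias above this should print <cut>, <-d > and <-f2> on separate lines.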

evil otto
  • "as l0b0 said, it's not readline." - I thought it was Readline because of the following sentence about the [COMP_WORDS](http://www.gnu.org/software/bash/manual/bashref.html#Bash-Variables) array (which happens to contain the line split in such a way) in the Bash reference: "The line is split into words as Readline would split it". – smls Jan 21 '12 at 00:52
  • That's part of programmable completion, which _is_ done by readline. The shell's parsing of commands is independent of readline, based more on IFS (as some other poster pointed out). readline is just a fancy library for handling line editing. – evil otto Jan 21 '12 at 01:36
2

To minimize evil otto's solution:

alias c2="cut -d' ' -f2"
alias_definition=$(alias c2)
eval eval alias_parts=( "${alias_definition#alias c2=}" )

You can use `declare -p' to do a quick array print:

$ declare -p alias_parts
declare -a alias_parts='([0]="cut" [1]="-d " [2]="-f2")'

Also useful may be `printf %q' to quote an argument "in a way that can be reused as shell input" (from: help printf):

$ printf %q "${alias_parts[1]}"
-d\
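
As a small sketch of why that is handy: the %q output can be fed back to eval and gives the original words again. The variable names requoted and roundtrip are just picked for this example, and the array is filled in by hand so the snippet runs on its own.

```bash
# Words as they come out of the eval-based splitting above.
alias_parts=("cut" "-d " "-f2")

# Re-quote every word so the result is safe to reuse as shell input.
requoted=$(printf '%q ' "${alias_parts[@]}")
echo "$requoted"

# Parsing the re-quoted string again yields the same three words.
eval "roundtrip=( $requoted )"
printf '<%s>\n' "${roundtrip[@]}"
```

Note that %q picks its own quoting style (backslash escapes), so you get an equivalent command line, not necessarily the one that was typed.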

Freddy Vulto
http://fvue.nl/wiki/Bash

fvue
  • I needed to put the second parameter of the printf in quotes to make it work like that: `printf %q "${alias_parts[1]}"` – smls Jan 24 '12 at 13:21
  • Btw, is there a definitive advantage of this method over the "eval set --" method proposed by @tripleee? – smls Jan 24 '12 at 13:23
1

That's not readline splitting, it's getopt or getopts. For example:

params="$(getopt -o d:h -l directory:,help --name "$0" -- "$@")"

eval set -- "$params"
unset params

while true
do
    case "${1-}" in
        -d|--directory)
            directory="$2"
            shift 2
            ;;
        -h|--help)
            usage
            exit
            ;;
        --)
            shift
            if [ "${1+defined}" = defined ]
            then
                usage
            fi
            break
            ;;
        *)
            usage
            ;;
    esac
done
l0b0
  • 1
    No, that's not what I mean. Even without getopt/getopts, you can e.g. call a bash script like `./test.sh a b 'c d'`, and inside the script the parameter $3 will be set to 'c d'. That's the kind of splitting I need, except I don't need it for script parameters, but rather to manually apply it to a string saved in a variable. – smls Jan 20 '12 at 13:24
1

The set built-in can be used for splitting strings.

bash$ set -- cut -d ' ' -f2

bash$ echo "'$3'"
' '

Edit: If the string you want to split is already in a variable, that's a lot trickier. You might play around with eval but in this case I'd say that complicates things, rather than simplifies them.

bash$ a="cut -d ' ' -f2"

bash$ eval set -- $a  # No quoting!

bash$ echo "'$3'"
' '
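
If you don't want to clobber the positional parameters of the calling script, the eval set -- trick can be wrapped in a function, since set -- inside a function only changes that function's own "$@". A minimal sketch, assuming the string is trusted (eval will run anything in it); the names split_words and words are made up for this example:

```bash
# split_words STRING (hypothetical helper)
# Re-parses STRING with the shell's own rules and stores the
# resulting words in the global array "words".
split_words() {
    eval "set -- $1"    # only this function's positional parameters change
    words=("$@")
}

a="cut -d ' ' -f2"
split_words "$a"
printf '<%s>\n' "${words[@]}"
```

For the string above this should give four words, the third one being a single space.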
tripleee
  • This is close to what I want. However, in addition to splitting the arguments, it also seems to perform string expansion on the individual arguments, so instead of the array `[cut,-d,' ',-f2]` you get the array `[cut,-d, ,-f2]` (with the quotes removed in the third term). Is it somehow possible to perform just the splitting step on its own, to preserve the raw parameters? – smls Jan 20 '12 at 13:42
  • 1
    No, I don't think so. I would selectively add quotes where they are required for displaying to the user, but they aren't properly part of the command, they only serve to escape spaces etc from the shell. In other words, `[cut,-d, ,-f2]` is exactly what you need. – tripleee Jan 20 '12 at 15:24
  • ... But yes, if your alias contains e.g. an unquoted wildcard, that needs to be escaped before you pass it to `eval`. – tripleee Jan 20 '12 at 18:50
  • Unfortunately, in my case `[cut,-d,' ',-f2]` is really what I need, because I'm trying to add the individual parameters to the COMP_WORDS array inside a custom bash completion function before calling a pre-existing bash completion function (specifically, the one defined for the alias'ed command), and I would like to add them to the array in the same way that the shell itself would add them if the full command contained in the alias would be TAB-completed directly. And that happens to be without expanding any strings or escapes, but rather just split at parameter boundaries. – smls Jan 21 '12 at 00:48
  • 1
    No, you misunderstand. The quotes are not part of the value, they are there to prevent it from being substituted, but once you have a space in a variable, the variable's value itself does not (and should not) include the quotes. – tripleee Jan 27 '12 at 11:44
  • Yes, but the tab-completion system is a special case. Whenever calling a completion function, the shell fills the COMP_LINE variable containing the command line exactly as it has been typed in, and the COMP_WORDS array containing the exact same line, split into fields. The fields do not contain the logical value of each parameter, but rather the corresponding fragment of the command line exactly as it has been typed in. So in order to not break completion functions which expect this behavior, it should be mimicked exactly when manually modifying these variables. – smls Jan 28 '12 at 13:17
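
For the "split only, keep the quotes as typed" case described in the comments above, one option is to not use eval at all and walk the string character by character. What follows is a rough sketch, not what bash does internally: the function name split_raw_words and the array raw_words are invented for this example, it only understands single quotes, double quotes and backslashes, and it does not break words at the COMP_WORDBREAKS characters (such as = or :) the way the real COMP_WORDS array does.

```bash
# split_raw_words STRING (hypothetical helper)
# Splits STRING at unquoted blanks and stores the pieces, with their
# quotes and backslashes left in place, in the global array raw_words.
split_raw_words() {
    local input=$1 word='' ch quote='' in_word=0 i
    raw_words=()
    for (( i = 0; i < ${#input}; i++ )); do
        ch=${input:i:1}
        if [[ -n $quote ]]; then
            # Inside quotes: copy everything until the closing quote.
            word+=$ch
            if [[ $quote == '"' && $ch == '\' ]]; then
                # In double quotes a backslash protects the next character.
                i=$(( i + 1 )); word+=${input:i:1}
            elif [[ $ch == "$quote" ]]; then
                quote=''
            fi
            continue
        fi
        case $ch in
            ' '|$'\t'|$'\n')
                # Unquoted blank: finish the current word, if any.
                if (( in_word )); then
                    raw_words+=("$word"); word=''; in_word=0
                fi
                ;;
            "'"|'"')
                quote=$ch; word+=$ch; in_word=1 ;;
            '\')
                # Unquoted backslash: keep it together with the next character.
                i=$(( i + 1 )); word+="$ch${input:i:1}"; in_word=1 ;;
            *)
                word+=$ch; in_word=1 ;;
        esac
    done
    if (( in_word )); then raw_words+=("$word"); fi
    return 0
}

split_raw_words "cut -d' ' -f2"
printf '<%s>\n' "${raw_words[@]}"
```

For the example alias this gives the three fields cut, -d' ' and -f2, i.e. the quotes stay part of the second field.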
0

If we put each of alias_command's arguments on its own line, and then (locally) set IFS=\n, we're done:

parsealias ()
{
   alias_command_spaces=$(eval "echo $(alias $1)" | sed -e "s/alias $1=//") # "cut -d' ' -f2"
   alias_command_nl=$(eval each_arg_on_new_line $alias_command_spaces)      # "cut\n-d' '\n-f2"
   local IFS=$'\n' # split on newlines, not on spaces
   alias_parts=($alias_command_nl) # each line becomes an array element, just what we need
   # now do useful things with alias_parts ....
}

Now we only need to write the command each_arg_on_new_line used above, e.g:

#!/usr/bin/env perl

foreach (@ARGV) {
  s/(\s+)/'$1'/g; # put spaces within quotes
  print "$_\n";
}
Hans Lub