Suppose I have the following list of files:

/aaa/bbb/file1.txt
/aaa/ccc/file2.txt
/aaa/bbb/file3.txt
/aaa/bbb/file4.txt
/aaa/ccc/file5.txt

And I'd like to have a set of all of the unique dirnames in an array. The resulting array would look something like this:

dirs=( "/aaa/bbb" "/aaa/ccc" )

I think I can do something like this, but it feels really verbose (pardon any syntax errors; I don't have a shell handy):

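dirs=()
for f in "${filelist[@]}"; do
    dir=$(dirname "$f")
    # Linear scan: look for $dir among the directories collected so far.
    i=0
    while [ "$i" -lt "${#dirs[@]}" ]; do
        if [ "${dirs[$i]}" == "$dir" ]; then
            break
        fi
        i=$((i + 1))
    done
    # If the scan ran off the end, $dir is new, so append it.
    if [ "$i" -eq "${#dirs[@]}" ]; then
        dirs+=("$dir")
    fi
done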
awm129

1 Answer

Use associative arrays:

declare -A dirs

for f in "${filelist[@]}"; do
    dir=$(exec dirname "$f") ## Or dir=${f%/*}
    dirs[$dir]=$dir
done

printf '%s\n' "${dirs[@]}"

Or if input is from file:

readarray -t files < filelist
for f in "${files[@]}"; do
    dir=$(exec dirname "$f") ## Or dir=${f%/*}
    dirs[$dir]=$dir
done
  • The exec inside the command substitution keeps unnecessary forks in the subshell to a minimum.
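
For a quick end-to-end check, here is a minimal sketch of the same approach run against the sample paths from the question. It assumes bash 4.0 or later. The name `unique_dirs` and the final sort are additions for illustration and are not part of the answer above: the iteration order of an associative array is unspecified, so sorting is one way to get a stable indexed array. It also assumes no directory name contains a newline.

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ (associative arrays, readarray).

# Sample input taken from the question.
filelist=(
    /aaa/bbb/file1.txt
    /aaa/ccc/file2.txt
    /aaa/bbb/file3.txt
    /aaa/bbb/file4.txt
    /aaa/ccc/file5.txt
)

declare -A dirs

for f in "${filelist[@]}"; do
    dir=$(dirname "$f")
    dirs[$dir]=$dir      # assigning the same key again simply overwrites it
done

# unique_dirs is a hypothetical name: a plain indexed array built from the
# values, sorted because associative arrays have no defined order.
readarray -t unique_dirs < <(printf '%s\n' "${dirs[@]}" | sort)

printf '%s\n' "${unique_dirs[@]}"
```

With the five sample paths this leaves two entries in `unique_dirs`, `/aaa/bbb` and `/aaa/ccc`.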
konsolebox
  • You don't need to set the value to `$dir`, just the key: `dirs[$dir]=1`. You can then use `"${!dirs[@]}"` to get the list of keys, and `if [ "${dirs[FOO]:-}" ]` to test whether `"FOO"` is in the set. – Keith Thompson Aug 12 '14 at 18:40
  • @KeithThompson Yes I considered that. However I still chose to add the value anyway and use `"${dirs[@]}"`. – konsolebox Aug 12 '14 at 18:44
  • You could apply that parameter expansion to the array itself in the second example: `for file in "${files[@]%/*}"…` (Also, you confuse `f` and `file`.) – kojiro Aug 12 '14 at 18:45
  • @kojiro Considered too. However I have doubts that `dir=${#/*}` is an accurate replacement for `dirname`. I recall thinking of probable issues with it and I think it needs more adjustments. – konsolebox Aug 12 '14 at 18:46
  • The most likely issue is that you meant `%`, not `#`. Other than that, I think it's fine. – kojiro Aug 12 '14 at 18:48
  • So you're duplicating information for the sake of convenience. (That's not a criticism, BTW; it's a perfectly valid choice.) – Keith Thompson Aug 12 '14 at 18:52
  • @KeithThompson For less complexity. It's easier to process values over keys when you want to do other things like `"${x[@]//something}"`. – konsolebox Aug 12 '14 at 18:54
  • @kojiro For a start, I'm thinking about the root directory. – konsolebox Aug 12 '14 at 18:54
  • With `x=/a/b/c/`, `${x%/*} == /a/b/c` whereas `dirname "$x" == /a/b`. – konsolebox Aug 12 '14 at 18:56
  • Associative arrays! Perfect... I had no idea bash had such a thing. Thanks! – awm129 Aug 12 '14 at 18:58
  • @kojiro see this http://stackoverflow.com/a/22402242/632407 for the comparison of the basename/dirname and variable substitutions. ;) – clt60 Aug 12 '14 at 19:00
  • @awm129 Associative arrays are a relatively recent (version 4, released in 2009) addition to `bash`. – chepner Aug 12 '14 at 19:15
  • @jm666 oh, yeah, I'm aware that they're not identical. But for the given input the difference is negligible, and there's a lot of value in avoiding N subshells. – kojiro Aug 12 '14 at 19:45
  • @konsolebox, ...though, to argue a counterpoint, you can iterate over keys with `${!foo[@]}`, so support for iterating over values with `${foo[@]}` only saves one character. – Charles Duffy Aug 12 '14 at 19:55
  • @awm129, it was introduced in 4.0 -- so you _do_ need to be wary of systems (mostly OS X) still running 3.x shells. – Charles Duffy Aug 12 '14 at 19:56
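
The points raised in the comments can be sketched together as follows. This is an illustration only, assuming bash 4.0 or later. The names `dirset`, `in_set`, `missing`, `trailing`, and `rootfile` are hypothetical. It stores only keys (value `1`), tests membership with the `:-` default, lists keys with `"${!dirset[@]}"`, and then shows two inputs where `${x%/*}` and `dirname` disagree.

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ (associative arrays).

declare -A dirset    # hypothetical name; only the keys matter

for f in /aaa/bbb/file1.txt /aaa/ccc/file2.txt /aaa/bbb/file3.txt; do
    dirset[${f%/*}]=1    # parameter expansion: no subshell, fine for these inputs
done

# Membership test: the :- default yields an empty string for a missing key.
in_set=no
if [ "${dirset[/aaa/bbb]:-}" ]; then in_set=yes; fi
missing=no
if [ -z "${dirset[/nope]:-}" ]; then missing=yes; fi
echo "in_set=$in_set missing=$missing"

# Keys come back in unspecified order, so sort them for display.
printf '%s\n' "${!dirset[@]}" | sort

# Inputs where the expansion and dirname differ:
trailing=/a/b/c/     # ${trailing%/*} gives /a/b/c, dirname gives /a/b
rootfile=/file       # ${rootfile%/*} gives an empty string, dirname gives /
for x in "$trailing" "$rootfile"; do
    printf 'input=%s expansion=[%s] dirname=[%s]\n' "$x" "${x%/*}" "$(dirname "$x")"
done
```

Whether the expansion is acceptable depends on the input: for absolute paths to files below the root directory, as in the question, both forms give the same result.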