Can xargs execute a subshell command for each argument?

Question

I have a command which is attempting to generate UUIDs for files:

find -printf "%P\n"|sort|xargs -L 1 echo $(uuid)

But in the result, xargs is only executing the $(uuid) subshell once:

8aa9e7cc-d3b2-11e4-83a6-1ff1acc22a7e file1
8aa9e7cc-d3b2-11e4-83a6-1ff1acc22a7e file2
8aa9e7cc-d3b2-11e4-83a6-1ff1acc22a7e file3

Is there a one-liner (i.e., not a function) to get xargs to execute a subshell command on each input?

@TomFenech: `-n 1` would actually split by any whitespace, whether line-interior or not, so the command would break with paths with embedded whitespace; `-L 1` comes closer to the intent, in that it performs line-by-line processing, but word-splitting is still applied to each line, so that potentially _multiple_ arguments are passed to `echo` per input line (which may or may not cause problems). The robust approach is to use `-I`, as in the accepted answer. — mklement0, Mar 26 '15 at 13:24

hek2mgl · Accepted Answer · 2017-07-31T16:38:27.853

24

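This is because $(uuid) is expanded once by the current shell, before xargs even starts, so xargs only ever sees the resulting string. You could explicitly call a shell so that the substitution runs once per input line: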

find -printf "%P\n"| sort | xargs -I '{}' bash -c 'echo $(uuid) {}'
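To see the difference without needing `uuid` installed, here is a small sketch that uses `$(echo $$)` as a stand-in: it prints the PID of whichever shell performs the expansion. The names `file1` to `file3` are just sample input. It also passes the line to bash as `$1` instead of splicing `{}` into the code string, so that a filename containing quotes or `$(...)` is not parsed as shell code.

```shell
# Stand-in for $(uuid): $(echo $$) prints the PID of the shell that performs
# the expansion, which shows where and how often the expansion happens.

# Double quotes: the calling shell expands $(...) once, before xargs starts.
once=$(printf '%s\n' file1 file2 file3 | xargs -L 1 echo "$(echo $$)")

# Single quotes: each child bash expands $(...) itself, once per input line.
# The input line is passed as $1 rather than embedded in the code string.
each=$(printf '%s\n' file1 file2 file3 |
  xargs -I '{}' bash -c 'echo "$(echo $$) $1"' _ '{}')

printf '%s\n' "$once" "$each"
```

The first three lines of output share one PID; the last three each show a different one.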

Btw, I would use the following command:

find -exec bash -c 'echo "$(uuid) ${1#./}"' -- '{}' \;

without xargs.
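A possible variation (a sketch, not part of the answer as posted): `-exec ... {} +` hands many filenames to a single bash, which then loops over them, so bash starts once per batch instead of once per file. The temporary directory and the filenames are made up for the demo, and `uuidgen` is assumed to be installed; swap in `uuid` if that is the tool you have.

```shell
# Demo tree with hypothetical names, including one with an embedded space.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b c.txt"

# One bash per batch of files ({} +) instead of one per file ({} \;).
# `for f` without a list iterates over the positional parameters.
# Assumption: uuidgen is available on this system.
out=$(cd "$dir" && find . -type f -exec bash -c '
  for f; do echo "$(uuidgen) ${f#./}"; done' _ {} + | sort -k2)

printf '%s\n' "$out"
rm -rf "$dir"
```

The UUID generator still runs once per file; only the number of bash processes goes down.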

edited Jul 31 '17 at 16:38

answered Mar 26 '15 at 12:41


2

Nicely done; but not only is `-n 1` superfluous, because `-I` implies line-by-line processing, `-n 1` would actually split by _any_ whitespace, whether line-interior or not. While `-L 1` does perform line-by-line processing, word-splitting is still applied to each line, whereas `-I` treats the entire line as a _single_ argument. – mklement0 Mar 26 '15 at 13:13

score 6 · Answer 2 · edited May 23 '17 at 10:30

hek2mgl's answer explains the problem well and his solution works well; this answer looks at performance.

The accepted answer is a tad slow, because it creates a bash process for every input line.

While xargs is generally preferable to and faster than a shell-code loop, in this particular case the roles are reversed, because shell functionality is needed in each iteration.

The following alternative solution uses a while loop to process the input lines, and, on my machine, is about twice as fast as the xargs solution.

find . -printf "%P\n" | sort | while IFS= read -r f; do echo "$(uuid) $f"; done

Note the use of while rather than for, because for cannot robustly parse command output (in short: filenames with embedded whitespace would break the command - see http://mywiki.wooledge.org/DontReadLinesWithFor).
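A minimal sketch of the difference, using two made-up filenames and assuming the default `IFS`. The counters are fed from a here-document rather than a pipe so that they survive in the current shell (a piped `while` loop would typically run in a subshell).

```shell
# Hypothetical input: two filenames, one with an embedded space.
names=$(printf '%s\n' 'plain.txt' 'with space.txt')

# for: the unquoted $names undergoes word splitting, giving 3 iterations.
for_count=0
for f in $names; do for_count=$((for_count + 1)); done

# while read: one iteration per input line, giving 2 iterations.
while_count=0
while IFS= read -r f; do while_count=$((while_count + 1)); done <<EOF
$names
EOF

echo "for: $for_count iterations, while: $while_count iterations"
```

The `for` loop sees `with` and `space.txt` as separate items, which is exactly what breaks the filename handling.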

If you're concerned about filenames with embedded newlines (very rare) and use GNU utilities, you could use NUL bytes as separators:

find . -printf "%P\0" | sort -z | while IFS= read -d '' -r f; do echo "$(uuid) $f"; done

Update: The fastest approach is to not use a shell loop at all, as evidenced by ᴳᵁᴵᴰᴼ's clever answer. See below for a portable version of his answer.

Compatibility note:

The OP's find command implies the use of GNU find (Linux), and uses features (-printf) that may not work on other platforms.

Here's a portable version of ᴳᵁᴵᴰᴼ's answer that uses only POSIX-compliant features of find (and awk).
Note, however, that uuid is not a POSIX utility; since Linux and BSD-like systems (including OSX) have a uuidgen utility, the command uses that instead:

find . -exec printf '%s\t' {} \; -exec uuidgen \; |
  awk -F '\t' '{ sub(/.+\//,"", $1); print $2, $1 }' | sort -k2
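To see what the awk and sort stages do in isolation, here is a sketch that feeds them simulated `find` output. The paths and the `ID-1`/`ID-2` values are placeholders, not real UUIDs. Note that the greedy `.+\/` strips everything through the last slash, so only the base name remains; for files in subdirectories that differs from what `%P` prints.

```shell
# Simulated output of the find stage: path, tab, ID (placeholder values).
result=$(printf './dir/b.txt\tID-1\n./a.txt\tID-2\n' |
  awk -F '\t' '{ sub(/.+\//, "", $1); print $2, $1 }' | sort -k2)

# Each line is now "ID name", ordered by name.
printf '%s\n' "$result"
```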

guido · Answer 3 · 2015-03-26T15:16:27.457

4

~~With a for loop:~~

for i in $(find -printf "%P\n" | sort) ; do echo "$(uuid) $i";  done

Edit: another way to do this:

find -printf "%P\0" -exec uuid -v 4 \; | sort | awk -F'\0' '{ print $2 " " $1}'

This prints each filename followed by its UUID (no subshell required) so that sort can order the lines by filename; awk then swaps the two columns, which are separated by a NUL byte.

edited Mar 26 '15 at 15:16

answered Mar 26 '15 at 12:44


This also works and is a slightly easier version to read as well as not having the overhead of a new bash on every argument. If I could split the credit, I would. Thanks. – adelphus Mar 26 '15 at 12:51
Using a shell loop in this instance is a good idea for performance reasons, but it's better to use a `while` loop, because `for` will break with filenames with embedded spaces, for instance - see http://mywiki.wooledge.org/DontReadLinesWithFor – mklement0 Mar 26 '15 at 14:02
1

@mklement0 that's very true thanks; anyway I decided this one that's discarding the loop is better – guido Mar 26 '15 at 14:58
1

Nicely done - that's even faster. As an aside: what platform are you (and the OP) on that you have a `uuid` utility? On BSD-like systems and Linux it's `uuidgen`. Interestingly, BSD `awk` interprets `-F'\0'` as `-F ''` (i.e., the _empty string_) and therefore splits the lines into individual characters (however, the `find` command as written wouldn't work with BSD `find` anyway). – mklement0 Mar 26 '15 at 17:51
1

@mklement0 it is this program http://www.ossp.org/pkg/lib/uuid/ packaged for fedora in my case; and GNU findutils 4.5.12 – guido Mar 26 '15 at 22:56
1

@mklement0 ...and on mine, uuid is in the same findutils package in Ubuntu. Interestingly, the uuid utility creates time-based IDs whereas uuidgen creates random-based IDs (by default). This results in strikingly different output when run in a loop: uuid creates sets of very similar IDs, uuidgen creates more randomised values. – adelphus Mar 31 '15 at 14:49
@adelphus: Thanks for that; let me add: _GNU_ `uuidgen` allows explicit control of what type to generate: `-r` for random-based, `-t` for time-based. _BSD_ `uuidgen`, by contrast, doesn't support this, and seemingly _invariably_ creates _random_-based ones. – mklement0 Mar 31 '15 at 15:13
