1

I've been trying various combinations of xargs and piping but I just can't get the right result. Previous questions don't quite cover exactly what I want to do:

  • I have a source directory somewhere, let's say /foo/source, with a mix of different files
  • I want to copy just the csv files found in source to a different destination, say /foo/dest
  • But I ALSO at the same time need to remove 232 header rows (e.g. using tail)

I've figured out that I need to pipe the results of find into xargs, which can then run commands on each find result. But I'm struggling to tail and then copy. If I pipe tail into cp, cp does not seem to receive the file (missing file operand). Here are some examples of what I've tried so far:

find /foo/source -name "*.csv" | xargs -I '{}' sh -c 'tail -n +232 | cp -t /foo/dest'

cp: missing file operand

find /foo/source -name "*.csv" | xargs -I '{}' sh -c 'tail -n +232 {} | cp -t /foo/dest'

Result:

cp: failed to access '/foo/dest': No such file or directory ...

find /foo/source -name "*.csv" | xargs -I '{}' sh -c 'tail -n +232 {} > /foo/dest/{}'

sh: /foo/dest/foo/source/0001.csv: No such file or directory ...

Any pointers would be really appreciated!

Thanks

fedorqui
James Allen

4 Answers

2

Just use find with -exec and store the file name in a variable:

find your_dir -name "*.csv" -exec sh -c 'f="$1"; tail -n +5 "$f" > "dest_dir/$(basename "$f")"' -- {} \;

Here, f="$1" makes $f hold the name of the file, including its full path. It is then a matter of redirecting the output of tail into a file in the destination directory, using basename to strip the path from the name.

Or, based on Random832's suggestion below in comments (thanks!):

find your_dir -name "*.csv" -exec sh -c 'tail -n +5 "$1" > "dest_dir/$(basename "$1")"' -- {} \;
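
If there are many files, starting one shell per file can be slow. A possible variation, sketched here under the assumption that your find supports the POSIX `-exec ... {} +` form, passes the files in batches to a single shell that loops over them. The scratch directory, the sample file name and the `+3` offset are made up for the demonstration.

```shell
# Hypothetical scratch directories, created only for this demonstration.
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dest"
printf 'h1\nh2\nrow1\nrow2\n' > "$work/src/sub/my data.csv"

# "$0" is the literal word sh, "$1" is the destination directory,
# and the remaining arguments are the files found (batched by "+").
find "$work/src" -name '*.csv' -exec sh -c '
  dest=$1; shift
  for f do
    tail -n +3 "$f" > "$dest/$(basename "$f")"
  done
' sh "$work/dest" {} +

cat "$work/dest/my data.csv"   # prints row1 and row2
```

Passing the destination as an argument, rather than pasting it into the script text, keeps paths with spaces intact.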
Community
fedorqui
  • @JamesAllen ok, now I found another way using `-exec`. – fedorqui Sep 06 '16 at 14:06
  • Thanks @fedorqui, Barmar just beat you to it using xargs and basename. Thanks though! – James Allen Sep 06 '16 at 14:11
  • @JamesAllen yep, I always wonder what is best, to pipe to `xargs` or use `-exec`. I prefer the latter, but both ways should be equivalent in most of the cases. – fedorqui Sep 06 '16 at 14:12
  • This use of exec will not work correctly with filenames that may contain space or other special characters. Use `-exec sh -c 'f="$1"; ...' -- {} \;` (or just use `"$1"` throughout the shell command and don't use a `$f` variable) – Random832 Sep 06 '16 at 14:36
  • @Random832 wo, that's a great hint. Would you mind explaining why is `--` making it work? Updated the answer, many thanks. – fedorqui Sep 07 '16 at 06:23
  • @fedorqui I actually discovered the need for it by accident and assumed it was always required, but it turns out that the first argument after `-c '...'` gets put in $0. But $0 has a special meaning so I wouldn't recommend using it just to be able to omit the -- argument. – Random832 Sep 07 '16 at 06:32
  • @Random832 ah, I see. So by saying `sh -c '...' -- {} \;` we are passing `--` as the first argument ($0), {} as the second one ($1) and so on, cool! For example `find aaa -name "*.csv" -exec sh -c 'echo $2, $3;' -- -- a b \;` returns `a, b`. I wasn't aware of the way `sh -c` was getting its arguments. – fedorqui Sep 07 '16 at 08:15
  • @Random832 because in fact this also happens when you say `bash -c 'echo "$1"' "var1" "var2"`. It will print var2 - I don't like this inconsistency of `$0` :/ – fedorqui Sep 08 '16 at 07:03
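
The argument mapping discussed in these comments can be checked directly from a prompt. This is only a small illustration; the words `a` and `b` are arbitrary placeholders.

```shell
# The first word after the command string becomes $0, the following ones $1, $2, ...
out=$(sh -c 'echo "0=$0 1=$1 2=$2"' -- a b)
echo "$out"   # prints: 0=-- 1=a 2=b
```

So the `--` here is not parsed as an option terminator. It simply occupies the `$0` slot, and any other placeholder word (a common choice is `sh`) would do the same job.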
1

Your last command is close, but the problem is that {} is replaced with the full pathname, not just the filename. Use the basename command to extract the filename from it.

find /foo/source -name "*.csv" | xargs -I '{}' sh -c 'tail -n +232 {} > /foo/dest/$(basename {})'
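
Note that with `{}` inside the quoted script, the file name is pasted into the shell code itself, so names containing spaces or quotes can break the command. A more defensive sketch, assuming a find and xargs that support `-print0` and `-0` (GNU and BSD versions do), hands the name to the shell as a positional parameter instead. The scratch directory, the sample file and the `+3` offset are invented for the demonstration.

```shell
# Hypothetical scratch directories, created only for this demonstration.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/dest"
printf 'h1\nh2\nrow1\n' > "$work/source/a b.csv"

# NUL-delimited names survive spaces and newlines; "$1" is the file, "$2" the destination.
find "$work/source" -name '*.csv' -print0 |
  xargs -0 -I '{}' sh -c 'tail -n +3 "$1" > "$2/$(basename "$1")"' sh '{}' "$work/dest"

cat "$work/dest/a b.csv"   # prints row1
```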
Barmar
1

As an alternative to find and xargs you could use a for loop (which, unlike find, does not descend into subdirectories), and as an alternative to tail you could use sed. Consider this:

source=/foo/source
dest=/foo/dest
for csv in "$source"/*.csv; do sed '232,$!d' "$csv" > "$dest/$(basename "$csv")"; done
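
One detail worth double-checking is the line count. Both `tail -n +N` and `sed 'N,$!d'` start the output AT line N, so they drop N-1 lines. If the files really have 232 header rows to discard, `tail -n +233` or `sed '1,232d'` may be what is wanted. A tiny illustration on five lines of made-up input:

```shell
# Five numbered lines stand in for a real file.
a=$(printf '%s\n' 1 2 3 4 5 | tail -n +3)    # starts at line 3: 3 4 5
b=$(printf '%s\n' 1 2 3 4 5 | sed '3,$!d')   # same result as tail -n +3
c=$(printf '%s\n' 1 2 3 4 5 | sed '1,3d')    # removes exactly 3 lines: 4 5
d=$(printf '%s\n' 1 2 3 4 5 | tail -n +4)    # same result as sed '1,3d'
printf '%s\n' "$a" "---" "$c"
```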
Jorge Valentini
0

Using GNU Parallel you would do:

find /foo/source -name "*.csv" | parallel tail -n +232 {} '>' /foo/dest/{/}
Ole Tange