15

I have the following I want to do:

find . -maxdepth 6 \( -name \*.tar.gz -o -name bediskmodel -o -name src -o -name ciao -o -name heasoft -o -name firefly -o -name starlink -o -name Chandra \) -prune -o -print | tar  cvf somefile.tar --files-from=-

I.e., exclude a whole lot of stuff, only look to six subdirectories depth, and then once pruning is done, 'tar' up the rest.

It is not hard. The bit before the pipe (|) works 100%. If I exclude the 'tar', then I get what I'm after (to the screen). But once I include the pipe, and the tar, it tars everything, including all the stuff I've just excluded in the 'find'.

I've tried a number of different iterations:

-print0 | xargs -0 tar rvf somefile.tar
-print0 | xargs -0 tar rvf somefile.tar --null --files-from=-
-print0 | tar cvf somefile.tar --null -T -

So what am I doing wrong? I've done this before; but now it's just giving me grey hairs.

Peter Mortensen
zinkeldonk
  • I believe you need quotes around the `*.tar.gz` to keep `bash` from expanding it before it is passed to `find` – drevicko May 06 '13 at 01:07

6 Answers

27

A combination of the -print flag for find, and then --files-from on the 'tar' command worked for me. In my case I needed to tar up 5000+ log files, but just using 'xargs' only gave me 500 files in the resulting file.

find . -name "*.pdf" -print | tar -czf pdfs.tar.gz --files-from -

You have "--files-from=-" when you just want "--files-from -", and I think you need a - in front of cvf, like the following:

find . -maxdepth 6 \( -name '*.tar.gz' -o -name bediskmodel -o -name src -o -name ciao -o -name heasoft -o -name firefly -o -name starlink -o -name Chandra \) -prune -o -print | tar -cvf somefile.tar.gz --files-from -
david.tanner
  • 4
    This is the correct solution because using `--files-from -` avoids the issue with xargs limits (clear in the comments on @rajshenoy's example) that results in incomplete archives. – mako Jun 22 '15 at 22:04
  • print0 is better: find . -name "*.pdf" -print0 | tar -czf pdfs.tar.gz --null --files-from - – Roland Jul 01 '22 at 11:10
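
As a minimal sketch of the null-delimited variant from the comment above, assuming GNU find and GNU tar (the scratch directory and file names are made up for illustration):

```shell
# Hypothetical scratch directory; all names here are for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/docs"
touch "$demo/docs/plain.pdf" "$demo/docs/with space.pdf" "$demo/docs/notes.txt"

# -print0 and --null make both sides use NUL as the separator,
# so unusual characters in a file name cannot split one entry in two.
(cd "$demo" && find . -name '*.pdf' -print0 |
    tar -czf pdfs.tar.gz --null --files-from -)

# List what ended up in the archive.
tar -tzf "$demo/pdfs.tar.gz"
```

The listing should contain only the two PDF files; `notes.txt` never reaches `tar` because `find` did not print it.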
6

I remember doing something like the line below to 'tar' a bunch of files together. I was specific about the files I wanted to group, so I ran something like this:

find . -name "*.xyz" | xargs tar cvf xyz.tar;

In your case, I wonder why you are using "-o" before the -print; that seems to be including everything again.

rajshenoy
  • 5
    correct me if I'm wrong, but I believe that if you have many files output by `find`, `xargs` will run `tar` multiple times with subsets of the file list. Unfortunately, the `-c` then overwrites the previous tar files and you only get the last lot in your final tar file. – drevicko May 06 '13 at 00:44
  • You can try it. I successfully got a tar file of the 7-8 files that I searched for using find. What happens here is that find returns its output and xargs feeds it to tar, generating a single tar file – rajshenoy Jun 10 '13 at 14:35
  • 2
    @rajshenoy 7-8 files probably isn't enough to uncover this problem. Have a look at the section 'Maximum command length' in [this page](http://offbytwo.com/2011/06/26/things-you-didnt-know-about-xargs.html). Try `echo | xargs --show-limits` to see the size of the command line buffer xargs is using - for me it's 131072. That's quite large, but if you've a few thousand files, it's quickly used up! – drevicko Jun 11 '13 at 04:24
  • 1
    @Drevicko - Thanks, You are correct. I will keep this in mind – rajshenoy Jun 13 '13 at 16:04
  • @drevicko Does that mean we have to do this in two steps? Like building the tarball with `$(cat mylist) | xargs tar -rf myArchive.tar`, then gzipping it with `gzip myArchive.tar`? – Stphane Dec 17 '15 at 00:45
  • @Stphane use @david.tanner's solution: ask `find` to output the file names and `tar` to read them from stdin. – drevicko Jan 07 '16 at 16:07
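
The batching described in these comments can be seen without creating any archive. In this sketch `echo` stands in front of `tar` so the commands are only printed, and `-n 2` is an assumption used to force small batches, mimicking what happens when a long file list exceeds the command-line limit. The file names are made up.

```shell
# Five stand-in file names; -n 2 caps each invocation at two arguments.
batches=$(printf '%s\n' a.log b.log c.log d.log e.log |
    xargs -n 2 echo tar cvf logs.tar)

# Each printed line is one separate tar invocation.
printf '%s\n' "$batches"
# tar cvf logs.tar a.log b.log
# tar cvf logs.tar c.log d.log
# tar cvf logs.tar e.log
```

With `c` (create), every one of those invocations would start `logs.tar` from scratch, so only the last batch would survive.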
4

If your 'find' is returning directories, then those will be passed to 'tar', and the full contents will be included, regardless of the exclusions in your 'find' command.

So, I think you need to include a "-type f" in the 'find'.
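
A small sketch of that effect, assuming GNU find and tar. The directory layout is hypothetical and the pruned set is reduced to a single `src` directory to keep it short:

```shell
# Hypothetical scratch tree: one directory to keep, one to prune.
work=$(mktemp -d)
mkdir -p "$work/tree/keep" "$work/tree/src"
touch "$work/tree/keep/a.txt" "$work/tree/src/b.c"

# Directories are printed too: tar receives "." and recurses into it,
# which pulls ./src/b.c back in despite the -prune.
(cd "$work/tree" && find . -name src -prune -o -print |
    tar -cf "$work/with_dirs.tar" --files-from -)

# Regular files only: tar archives exactly the names find printed.
(cd "$work/tree" && find . -name src -prune -o -type f -print |
    tar -cf "$work/files_only.tar" --files-from -)

tar -tf "$work/with_dirs.tar"    # includes ./src/b.c
tar -tf "$work/files_only.tar"   # lists only ./keep/a.txt
```

If the directory entries themselves should stay in the archive, GNU tar's `--no-recursion` option is another way to stop `tar` from descending into the directories it is handed.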

Joe Watkins
0

I use a combination of the two previous approaches. To back up a day's work I do this:

rm -rf new.tgz; find . -type f -mtime 0 | xargs tar cvf new.tgz;
Thomas Altfather Good
0

Using --files-from without an argument was the only way to make it work for me. Every other variant included all files in the directory rather than my generated list.

This was my solution:

find . ! -name '*.gz' -print | xargs tar cvzf ../logs.tar.gz --files-from
Rodo
  • Wouldn't [using option `-type f` for `find`](https://stackoverflow.com/questions/11540964/find-with-xargs-and-tar/15529687#15529687) fix that? – Peter Mortensen Feb 16 '21 at 14:26
0

This works for me, where ARG can be any placeholder name:

find . -name "*.tar.gz" -print | xargs -I ARG tar -xvzf ARG

...

Louis Loudog Trottier