113

I want to tar all .php and .html files in a directory and its subdirectories. If I use

tar -cf my_archive *

it tars all the files, which I don't want. If I use

tar -cf my_archive *.php *.html

it ignores subdirectories. How can I make it tar recursively but include only two types of files?

user1566515

8 Answers

194

find ./someDir -name "*.php" -o -name "*.html" | tar -cf my_archive -T -
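
A minimal sketch of extending this to more patterns, assuming GNU find and tar. The function name `archive_by_ext` and the `.css` pattern are illustrative and not part of the original answer. Each pattern is quoted so that find, rather than the shell, interprets the wildcard; an unquoted pattern that the shell expands to several file names is a common cause of the `paths must precede expression` error.

```shell
#!/usr/bin/env bash
# archive_by_ext DIR ARCHIVE: hypothetical helper, not from the original answer.
# Archives every .php, .html and .css file under DIR into ARCHIVE.
archive_by_ext() {
    local dir=$1 archive=$2
    # Quote each pattern so the shell does not expand it before find runs.
    # -T - tells tar to read the newline-separated file list from stdin.
    find "$dir" -type f \( -name "*.php" -o -name "*.html" -o -name "*.css" \) |
        tar -cf "$archive" -T -
}
```

Usage would look like `archive_by_ext ./someDir my_archive.tar`. File names that contain newlines would still break this newline-separated list.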

DeeDee
  • @DeeDee Are there any limitations on the number of files, etc.? – user1566515 Sep 11 '13 at 04:32
  • 1
    @DeeDee - no, what I meant was you don't need the parens! – Mike Makuch Sep 11 '13 at 13:18
  • 1
    @user1566515 There may be some filesystem limit or overall space limit which would put an upper limit on your tar file. That entirely depends on your own system. Otherwise, the piping will essentially create the tar file on-the-fly, so you won't be constrained by file number or size. – DeeDee Sep 11 '13 at 13:46
  • Thanks! ... how to add more than 2 conditions / kind of file? – gluuke Sep 16 '14 at 11:37
  • 6
    @gluuke use `-o -name [pattern]` for each new condition – DeeDee Sep 16 '14 at 14:37
  • @DeeDee : I am sorry, for some reason I am obtaining `find: paths must precede expression Usage: find [-H] [-L] [-P] [path...] [expression]` – gluuke Sep 23 '14 at 16:48
28

If you're using bash 4.0 or later, you can use shopt -s globstar to make short work of this:

shopt -s globstar; tar -czvf deploy.tar.gz **/Alice*.yml **/Bob*.json

This adds every .yml file whose name starts with Alice and every .json file whose name starts with Bob, from the current directory and any subdirectory.
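
A small sketch for checking what the globs expand to before handing them to tar, assuming bash 4.0 or later. `list_recursive` is a hypothetical helper name, not part of the answer. Because the shell performs the expansion, every match ends up on tar's command line, so a very large tree can still hit the argument-length limit.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or later; without globstar, ** behaves like a single *.
# nullglob drops a pattern that matches nothing instead of passing it literally.
shopt -s globstar nullglob

# list_recursive DIR: hypothetical helper that prints what the globs expand to.
list_recursive() {
    (
        cd "$1" || exit 1
        # With globstar set, **/ matches zero or more directory levels.
        printf '%s\n' **/*.php **/*.html
    )
}
```

If the listing looks right, the same patterns can be passed to tar, for example `tar -czvf deploy.tar.gz **/*.php **/*.html`.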

Stabledog
Sairam Krish
  • 3
    The only answer that just uses tar, the best answer IMO. – simon Nov 12 '17 at 21:09
  • 2
    Despite the impression by glob '**' for directory, this command does not execute recursively (any sub-sub-folders) – Eddie Mar 06 '18 at 23:42
  • @Eddie ** should work. Maybe there is something different about your parameters. Also check whether there is a space in the folder name that you pass on the command line. If not, can you paste your actual command? – Sairam Krish Mar 08 '18 at 09:20
  • '**' is evaluated by the shell before reaching the command and is only seen as 2 independent * characters, each of which resolves to 0 or more characters; it has no recursive functionality to span directories http://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm – Eddie Mar 19 '18 at 19:29
  • 4
    @eddie yes it is evaluated by shell, though bash > 4.0 has a `shopt -s globstar` option, so the answer is correct and is actually the best one – Roman Usherenko Aug 02 '18 at 16:14
  • 1
    `-bash: /usr/bin/tar: Argument list too long` The expansion happens before being passed to tar and fails for large numbers of files. – Neek Dec 09 '22 at 02:55
  • @dmitry_podyachev solution below works well, using find to generate a list of files, then `tar -czf file.tar -T files.txt` to tar the files named in `files.txt`. – Neek Dec 09 '22 at 03:01
21

One method is:

tar -cf my_archive.tar $( find -name "*.php" -or -name "*.html" )

There are some caveats with this method however:

  1. It will fail if any file or directory name contains spaces, and
  2. it will fail if there are so many files that the maximum command line length is exceeded.

A workaround to these could be to output the contents of the find command into a file, and then use the "-T, --files-from FILE" option to tar.

steampowered
Robin Sheat
  • 1) By "fail" do you mean the files with spaces will be skipped or the tar archive will not be created? 2) I have about 100K files. Is that over the maximum command line length? – user1566515 Sep 11 '13 at 04:30
  • 1
    1. It will create the archive, but it will report missing files. 2. That will be too long, I expect. Given this, you'd be best using a method like @DeeDee suggests below, it'll work around these problems quite nicely. – Robin Sheat Sep 11 '13 at 05:03
4

This will handle paths with spaces:

find ./ -type f \( -name "*.php" -o -name "*.html" \) -exec tar uvf myarchives.tar {} +
2

If you want to produce a zipped tar file (.tgz) and want to avoid problems with spaces in filenames:

find . \( -name \*.php -o -name \*.html \) -print0 | xargs -0 tar -cvzf my_archive.tgz

The -print0 “primary” of find separates output filenames using the NUL (\0) byte, thus playing well with the -0 option of xargs, which appends its (NUL-separated, in this case) input as arguments to the command that follows it.

The parentheses around the two -name primaries are needed, because otherwise the -print0 would only output the filenames of the second -name (there is no implied printing if -print or -print0 is present, and these only have an effect if they are evaluated).
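
The precedence point can be checked with a short sketch. The two helper functions below are hypothetical and only count how many paths find prints in each form.

```shell
#!/usr/bin/env bash
# Hypothetical demo helpers: count the NUL-terminated paths find emits.
count_without_parens() {
    # Parsed as: -name "*.php"  OR  ( -name "*.html" AND -print0 ),
    # so .php files match the first branch and are never printed.
    find "$1" -name "*.php" -o -name "*.html" -print0 | tr -cd '\0' | wc -c
}

count_with_parens() {
    # The grouping makes -print0 apply to both patterns.
    find "$1" \( -name "*.php" -o -name "*.html" \) -print0 | tr -cd '\0' | wc -c
}
```

In a directory holding one .php and one .html file, the first function should report one path and the second two.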

If you need to skip some filenames or directories (e.g., the node_modules directory if you work with Node.js), prepend one or more -prune primaries like this:

find . -name skipThisName -prune -o \
  -name skipThisOtherName -prune -o \
  \( -name \*.php -o -name \*.html \) -print0 | xargs -0 tar -cvzf my_archive.tgz
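
One caveat with piping into xargs: when the file list is too long for a single command line, xargs runs tar more than once, and each run with -c overwrites the archive written by the previous one. A sketch of an alternative, assuming GNU tar, lets tar read the NUL-separated list directly. The function name `archive_nul_safe` is illustrative.

```shell
#!/usr/bin/env bash
# archive_nul_safe DIR ARCHIVE: hypothetical helper, assumes GNU find and GNU tar.
archive_nul_safe() {
    local dir=$1 archive=$2
    # --null must come before -T so tar treats the list as NUL-separated.
    # tar is started exactly once, however many files find reports.
    find "$dir" -type f \( -name "*.php" -o -name "*.html" \) -print0 |
        tar --null -czf "$archive" -T -
}
```

Usage would look like `archive_nul_safe . my_archive.tgz`.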
Walter Tross
1

Put them in a file

find . \( -name "*.php" -o -name "*.html" \) -print > files.txt

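Then use the file as input to tar. The option is -T in GNU tar and -I in some other implementations (for example Solaris tar); in GNU tar, -I instead selects a compression program.

Add h to follow symbolic links, so that the files they point to are archived: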

tar cfh my.tar -I files.txt 
Noam Geffen
  • You mean `-T files.txt`, not `-I files.txt`, but otherwise, this is great for large numbers of files where expansion of `*` or `**` exceeds the shell limit. – Neek Dec 09 '22 at 03:00
1

Easy with zsh:

tar cvzf foo.tar.gz **/*.(php|html)
0

find ./ -type f \( -name "*.php" -o -name "*.html" \) -printf '%P\n' | xargs tar -I 'pigz -9' -cf target.tgz

That uses pigz to compress on multiple cores. For a single core:

find ./ -type f \( -name "*.php" -o -name "*.html" \) -printf '%P\n' | xargs tar -czf target.tgz