8

I'm trying to recursively copy a directory/file structure from one directory to another, keeping only HTML files. It should be a simple case of include/exclude, shouldn't it?

I just want to print out the files first. When I get that right, I'll copy them.

rsync -a --list-only -v SOURCEDIR --exclude='.*' --include='**/*.html' 

Gives me all the files.

rsync -a --list-only -v SOURCEDIR --include='**/*.html' --exclude='*' 

and

rsync -a --list-only -v SOURCEDIR --include='*.html' --exclude='*' 
rsync -a --list-only -v SOURCEDIR --include=*.html --exclude=*

Give me no files.

rsync -a --list-only -v SOURCEDIR --include='*.html' --exclude='*.*'

Looks like it gives me the whole directory structure and only html files. But I don't want empty directories.

Help!

On Mac OS 10.6

Joe

3 Answers

12

Rsync can be confusing about selective copies like this. I use the following to do the task that you're asking for:

rsync -avP \
--filter='+ */' \
--filter='+ **/*.html' \
--filter='- *' \
--prune-empty-dirs \
--delete \
/source/ \
/dest/

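Basically, you need to include all directories so that rsync descends into them, then include all *.html files, then exclude everything else. The order matters: rsync checks the rules top to bottom and the first one that matches wins.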

The --prune-empty-dirs option is handy here because it drops from the transfer any directory that would otherwise end up empty, i.e. one with no *.html file anywhere beneath it.

Shane Meyers
3

Have you considered using find to do your hard work?

Something along the lines of

find ./ -name "*.html" -exec rsync -R {} /target/base/directory/ \; 

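This recreates under /target/base/directory only those parts of the ./ tree that contain HTML files. The -R (--relative) option makes rsync keep each file's path as given on the command line; note that rsync is run once per file found.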

Matt Simmons
0

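I'm not 100% sure this is the best[0] way to do it, but you can add a very slight tweak to your last attempt and make it work. Just add the prune-empty-directories option (--prune-empty-dirs or -m). Bear in mind that --exclude='*.*' only excludes names containing a dot, so files without an extension would still be listed.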

rsync -am --list-only -v SOURCEDIR --include='*.html' --exclude='*.*'

[0] By 'best', I mean the cleanest and most efficient way. It seems like there should be a more elegant way of expressing this, but I don't know offhand what it is.

Christopher Cashell