31

As you can see in the title, I am trying to sync a folder with a list of files. I hoped that this command would delete all files in dest/ that are not on the list, but it didn't.

So I searched a bit and now know that rsync can't do this.

But I need it, so do you know of any way to do it?

PS: The list is created by a Python script, so a solution that uses some Python code would be fine.

EDIT, let's be concrete:

The list looks like this:

/home/max/Musik/Coldplay/Parachutes/Trouble.mp3
/home/max/Musik/Coldplay/Parachutes/Yellow.mp3
/home/max/Musik/Coldplay/A Rush of Blood to the Head/Warning Sign.mp3
/home/max/Musik/Coldplay/A Rush of B-Sides to Your Head/Help Is Around the Corner.mp3
/home/max/Musik/Coldplay/B-Sides (disc 3)/Bigger Stronger.mp3

and the command like this:

rsync --delete --files-from=/tmp/list / /home/max/Desktop/foobar/

This works, but if I delete a line from the list, the corresponding file is not deleted in foobar/.

EDIT 2:

rsync -r --include-from=/tmp/list --exclude=* --delete-excluded / /home/max/Desktop/foobar/

That doesn't work either ...

domids
dAnjou
  • Btw.: rsync version 3.0.6 protocol version 30 forgot that, sorry – dAnjou Nov 28 '09 at 22:54
  • 4
    One of the things I hate most about rsync, is that lack of support for exactly what you are asking for. Good post. – Felipe Alvarez Jan 08 '13 at 05:39
  • Until now, 2022, rsync still doesn't support this feature :)). I still face the same issue, but I have to rsync many files and extensions so I cannot use the solution --include-from that in the accepted comment. – Trung Nguyen Nov 15 '22 at 07:17

7 Answers

24

Perhaps you could do this using a list of include patterns instead, and use --delete-excluded (which does as the name suggests)? Something like:

rsync -r --include-from=<patternlistfile> --exclude=* --delete-excluded / dest/

If filenames are likely to contain wildcard characters (*, ? and [) then you may need to modify the Python to escape them:

import re
re.sub(r"([\[*?])", r"\\\1", "abc[def*ghi?klm")

Edit: Pattern-based matching works slightly differently to --files-from in that rsync won't recurse into directories that match the exclude pattern, for reasons of efficiency. So if your files are in /some/dir and /some/other/dir then your pattern file needs to look like:

/some/
/some/dir/
/some/dir/file1
/some/dir/file2
/some/other/
/some/other/dir/
/some/other/dir/file3
...
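
Since the list comes from a Python script anyway, the parent-directory entries could probably be generated there. A rough sketch, assuming the list holds absolute POSIX paths; the helper names escape_pattern and include_patterns are made up for illustration, and the escaping follows the re.sub idea above:

```python
import re
from pathlib import PurePosixPath


def escape_pattern(path):
    # Backslash-escape the rsync wildcard characters [, * and ?
    return re.sub(r"([\[*?])", r"\\\1", path)


def include_patterns(files):
    """Return include patterns: every parent directory first, then the file itself."""
    patterns = []
    seen = set()
    for f in files:
        p = PurePosixPath(f)
        # p.parents runs from the nearest directory up to "/", so reverse it
        for parent in list(p.parents)[::-1]:
            if str(parent) == "/":
                continue  # the transfer root itself needs no pattern
            d = escape_pattern(str(parent)) + "/"
            if d not in seen:
                seen.add(d)
                patterns.append(d)
        e = escape_pattern(str(p))
        if e not in seen:
            seen.add(e)
            patterns.append(e)
    return patterns


if __name__ == "__main__":
    for line in include_patterns(
        ["/some/dir/file1", "/some/dir/file2", "/some/other/dir/file3"]
    ):
        print(line)
```

The printed lines can be written to the file passed to --include-from.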

Alternatively, if all files are in the same directory then you could rewrite the command slightly:

rsync -r --include-from=<patternlistfile> --exclude=* --delete-excluded /some/dir/ dest/

and then your patterns become:

/file1
/file2

Edit: Thinking about it, you could include all directories with one pattern:

/**/

but then you'd end up with the entire directory tree in dest/ which probably isn't what you want. But combining it with -m (which prunes empty directories) should solve that - so the command ends up something like:

rsync -m -r --delete-excluded --include-from=<patternfile> --exclude=* / dest/

and the pattern file:

/**/
/some/dir/file1
/some/other/dir/file3
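
From Python, this last variant might be driven roughly like this. It is only a sketch: build_sync_command is a hypothetical helper, the pattern file goes to a temporary location, and the rsync call itself is left commented out so nothing is deleted by accident:

```python
import re
import subprocess
import tempfile


def build_sync_command(files, dest, source="/"):
    """Write the pattern file and return (rsync argument list, pattern file path)."""
    lines = ["/**/"]  # include every directory so rsync recurses everywhere
    for f in files:
        # escape rsync wildcard characters that occur in literal file names
        lines.append(re.sub(r"([\[*?])", r"\\\1", f))
    with tempfile.NamedTemporaryFile("w", suffix=".patterns", delete=False) as fh:
        fh.write("\n".join(lines) + "\n")
        pattern_file = fh.name
    cmd = ["rsync", "-m", "-r", "--delete-excluded",
           "--include-from=" + pattern_file, "--exclude=*", source, dest]
    return cmd, pattern_file


if __name__ == "__main__":
    cmd, _ = build_sync_command(["/some/dir/file1", "/some/other/dir/file3"], "dest/")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually run rsync
```

Passing the arguments as a list means no shell is involved, so --exclude=* reaches rsync unexpanded.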
SimonJ
  • Thank you, too, but your command asks for -d or -r, and neither works. – dAnjou Nov 28 '09 at 22:53
  • 3
    Are the files in a subdirectory? If so, the directory (and its parents) need to be in the pattern list as well, otherwise rsync won't even recurse into them. – SimonJ Nov 28 '09 at 23:18
  • I think at this point, it is worth to test your commands on your system first :P – dAnjou Nov 28 '09 at 23:36
  • AAAAHHH, now i get it. I have to write all parent dirs in the list. – dAnjou Nov 28 '09 at 23:41
  • 3
    Works fine here (apart from missing off -r first time round - that'll teach me for re-typing). I was hoping you'd check the man page and adapt my commands to your situation rather than blindly copy+pasting, though ;) – SimonJ Nov 28 '09 at 23:45
  • 1
    You might not need to, actually - depending on what you want to do with empty directories. Another edit coming up... – SimonJ Nov 28 '09 at 23:46
  • Now it finally works as it should. Thank you so much. I wish i could vote this answer up. – dAnjou Nov 29 '09 at 00:21
  • Just used this to sync an MPD playlist. I had to manually prepend an '/**/' the playlist though. I wish there was some way to not need that part. – tladuke May 27 '11 at 06:37
  • Amazing answer, the last solution saved my day. – Konrad Rudolph Dec 22 '11 at 19:06
  • So if the source directory is `./`, the first pattern should be `./**/` ? Also, if you don't want to edit your pattern file, can you simply have a `--include='./**/'` *and* a `--include-from=filelist` options? – PlasmaBinturong Mar 11 '19 at 21:56
14

This is not exactly the solution, but people coming here might find this useful: Since rsync 3.1.0 there is a --delete-missing-args parameter which deletes files in the destination directory when you sync two directories using --files-from. You would need to specify the deleted files in /tmp/list along with files you do want copied:

rsync --delete-missing-args --files-from=/tmp/list /source/dir /destination/dir

See the man page for more details.

Will Sheppard
  • 4
    Looked promising, but I had misunderstood what this option does. It will only delete files on the destination if they are listed in the --files-from list but cannot be found on the source. – mivk Apr 23 '19 at 17:02
  • isn't that exactly what was required. – Omid Oct 14 '21 at 11:01
10

As you explained, the command

rsync -r --delete --files-from=$FILELIST user@server:/ $DEST/

does not delete content in the destination when an entry has been removed from $FILELIST. A simple solution is to use the following instead.

mkdir -p "$DEST"
rm -rf "$TEMP"
rsync -r --link-dest="$DEST" --files-from="$FILELIST" user@server:/ "$TEMP/"
rm -r "$DEST"
mv "$TEMP" "$DEST"

This instructs rsync to sync into an empty destination ($TEMP). Files that are already present in the --link-dest directory are hard-linked locally rather than copied. Finally, the old destination is replaced by the new one. The initial mkdir creates an empty $DEST if it doesn't exist yet, which prevents an rsync error. (The $ variables are assumed to hold the full path to the respective file or directory.)

There is some minor overhead for the hard-linking, but you don't need to mess with complex include/exclude-strategies.
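
If the list is produced by a Python script anyway, the same steps could be sketched there. This is only an illustration: link_dest_command and sync_with_swap are hypothetical names, and the paths are assumed to be absolute as noted above:

```python
import os
import shutil
import subprocess


def link_dest_command(filelist, source, dest, temp):
    """Build the rsync call that fills temp, hard-linking unchanged files from dest."""
    return ["rsync", "-r", "--link-dest=" + os.path.abspath(dest),
            "--files-from=" + filelist, source, temp.rstrip("/") + "/"]


def sync_with_swap(filelist, source, dest, temp):
    os.makedirs(dest, exist_ok=True)         # mkdir -p $DEST
    shutil.rmtree(temp, ignore_errors=True)  # rm -rf $TEMP
    # check=True raises if rsync fails, so the old destination is not removed
    subprocess.run(link_dest_command(filelist, source, dest, temp), check=True)
    shutil.rmtree(dest)                      # rm -r $DEST
    shutil.move(temp, dest)                  # mv $TEMP $DEST


if __name__ == "__main__":
    print(link_dest_command("/tmp/list", "user@server:/", "/data/dest", "/data/tmp"))
```

Raising on a failed rsync run before the old destination is removed is a deliberate difference from the plain shell version, which would carry on regardless.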

m4t
4

Inspired by m4t's answer, but using rsync for the cleanup as well:

rsync -r --link-dest=$dest --files-from=filelist.txt user@server:$source/ $temp
rsync -ra --delete --link-dest=$temp $temp/ $dest
Slaven Rezic
131
1

Explicitly building an --exclude-from=... list seems to be the only way to synchronize a list of files.

import os
import subprocess

# src, dst, other_params, relative_files_list and norm_fn are defined elsewhere
other_params.append("--exclude-from=-")  # read exclude patterns from stdin

p = subprocess.Popen(
    'rsync -e ssh -zthvcr --compress-level=9 --delete'.split() + other_params + [src, dst],
    stdin=subprocess.PIPE, text=True)

if relative_files_list is not None:
    # hack: listing the excluded files seems the only way to delete unwanted files at the destination
    # note: rsync normally protects excluded files on the receiver; --delete-excluded may be needed as well
    files = set(map(norm_fn, relative_files_list))  # a set keeps lookups fast for huge lists
    for path, ds, fs in os.walk(src):
        for f in fs:
            rel_path_f = norm_fn(os.path.relpath(os.path.join(path, f), src))
            if rel_path_f not in files:
                p.stdin.write(rel_path_f + '\n')
p.stdin.close()  # always close stdin, otherwise rsync waits for more patterns
assert p.wait() == 0
dobrokot
0

I realize this question was asked a long time ago, but I wasn't satisfied with the answer.

Here is how I solved the problem, assuming a playlist created by mpd:

#!/bin/bash

playlist_path="/home/cpbills/.config/mpd/playlists"
playlist="${playlist_path}/${1}.m3u"
music_src="/home/cpbills/files/music"
music_dst="/mnt/sdcard/music/"

if [[ -e "$playlist" ]]; then
  # Remove old files
  find "$music_dst" -type f | while IFS= read -r file; do
    name="${file#"$music_dst"}"
    if ! grep -qF "$name" "$playlist"; then
      rm "$file"
    fi
  done

  # Remove empty directories
  find "$music_dst" -depth -type d -exec rmdir {} \; 2>/dev/null

  rsync -vu \
      --inplace \
      --files-from="$playlist" \
      "$music_src" "$music_dst"
else
  printf "%s does not exist\n" "$playlist" 1>&2
  exit 1
fi
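
The cleanup half of this script (remove unlisted files, then empty directories) could also be done in Python before calling rsync. A sketch, assuming the list holds paths relative to the destination root; prune_unlisted is a hypothetical name:

```python
import os


def prune_unlisted(dest_root, keep_relative_paths):
    """Delete files under dest_root that are not listed, then remove emptied directories."""
    keep = {os.path.normpath(p) for p in keep_relative_paths}
    removed = []
    # topdown=False visits children before parents, so emptied directories can go too
    for dirpath, dirnames, filenames in os.walk(dest_root, topdown=False):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.normpath(os.path.relpath(full, dest_root))
            if rel not in keep:
                os.remove(full)
                removed.append(rel)
        # never remove the destination root itself
        if dirpath != dest_root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(removed)
```

Unlike the grep -F check, this compares whole relative paths, so a file is kept only if its full path appears in the list.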
cpbills
-1

rsync is ideal for keeping directories in sync, among other useful things. If you do have an exact copy on the SOURCE, and want to delete files on the DEST, you can delete them from SOURCE and the rsync --delete option will delete them from DEST also.

However, if you just have an arbitrary list of files you want to delete, I suggest you use SSH to accomplish that:

ssh user@remote.host.com rm /path/to/file1 /path/to/file2

This will execute the rm command on the remote host.

Using python, you could:

import subprocess
FileList = ['/path/to/file1', '/path/to/file2']
subprocess.call(['ssh', 'dAnjou@my.server.com', 'rm'] + FileList)

~enjoy

gahooa
  • 5
    Misunderstanding. I don't have a list of files to delete. I have a list of files to copy. I want those files that are NOT on the list to be deleted. But thanks for your answer. – dAnjou Nov 28 '09 at 22:43