I'm syncing a bunch of files between my computer and Amazon S3. Say a couple of the files change name but their content stays the same. Does s3cmd have to delete the local file and re-download the "new" one just because it has a new name, or is there another way of detecting the change? I would like s3cmd, in that case, to simply rename the local file to match the new name on the server.
1 Answer
s3cmd upstream (the master branch at github.com/s3tools/s3cmd) and the latest published version, 1.5.0-rc1, can figure this out, provided you used a recent version with the --preserve option when you put the files into S3 in the first place, so that the md5sum of each file was stored. Using the md5sums, it knows that you have a duplicate (even if renamed) file locally and won't re-download it; instead it will do a local copy (or hardlink) from the existing file name to the name from S3.
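As a minimal sketch of that workflow (the bucket name and paths below are placeholders, not from the question), upload with --preserve and then sync the remote side back down after a rename:

    # Upload with --preserve so s3cmd stores each file's md5sum in the
    # object metadata (bucket and paths are placeholders):
    s3cmd sync --preserve ./photos/ s3://my-bucket/photos/

    # Later, after an object was renamed on the S3 side, sync back down.
    # A recent s3cmd should recognise that an identical local file already
    # exists (matched by md5sum) and copy/hardlink it to the new name
    # instead of re-downloading it:
    s3cmd sync s3://my-bucket/photos/ ./photos/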

Matt Domsch
- Thank you very much, Matt! Follow-up questions: Is the md5sum generated automatically by any upload method other than s3cmd, say the AWS API on the server side, or Cyberduck or another client? Also, will it only hard-link? How do I get it to rename the existing local file accordingly? – Paolo Oct 18 '14 at 21:29
- For files uploaded in one shot (not using multipart upload), s3cmd can figure out duplicate files because it can get the MD5SUM from the directory listing. For files uploaded with multipart upload (e.g. >15 MB by default) it can't do so; it has to get the MD5SUM from the metadata, which only s3cmd saves with the file. – Matt Domsch Dec 08 '14 at 02:09
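One way to see the difference Matt describes, assuming s3cmd's info command reports the MD5 for your objects (the bucket and object names below are hypothetical): a single-part upload's ETag is the file's plain MD5, while a multipart ETag carries a "-<part count>" suffix, so the true md5 has to come from the metadata s3cmd stores.

    # Compare the local md5sum with what S3 reports for the object
    # (placeholder names; adapt to your bucket/keys):
    md5sum ./photos/IMG_0001.jpg
    s3cmd info s3://my-bucket/photos/IMG_0001.jpg | grep -i md5

    # For a multipart-uploaded object the reported ETag looks like
    # "d41d8c...-12" (note the "-<part count>" suffix) and is not a plain
    # MD5 of the file, so the two values above would not match.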