I am trying to sync two s3 buckets:

s4cmd --dry-run sync s3://cgl-rnaseq-recompute-fixed/gtex s3://rnaseq.toil.20k/gtex

But I am getting the following error:

[Exception] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
[Thread Failure] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

The source bucket is publicly available. The second bucket is mine and I have access to it:

[centos@ip-172-30-3-12 data]$ s4cmd ls s3://rnaseq.toil.20k/
                 DIR s3://rnaseq.toil.20k/gtex/
                 DIR s3://rnaseq.toil.20k/pnoc/
                 DIR s3://rnaseq.toil.20k/target/
                 DIR s3://rnaseq.toil.20k/tcga/

Also, I cannot list the source bucket with s4cmd, but I can with s3cmd when I pass --requester-pays:

[centos@ip-172-30-3-12 data]$ s4cmd ls s3://cgl-rnaseq-recompute-fixed/gtex
[Exception] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
[Thread Failure] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

[centos@ip-172-30-3-12 data]$ s3cmd ls --requester-pays s3://cgl-rnaseq-recompute-fixed/gtex
                       DIR   s3://cgl-rnaseq-recompute-fixed/gtex/
2016-06-03 17:02    435553   s3://cgl-rnaseq-recompute-fixed/gtex-manifest

What could be going wrong? Any suggestions would be much appreciated.

Komal Rathi

1 Answer

To achieve the s3cmd behavior, use wildcards:

s4cmd sync s3://bucket/path/dirA/* s3://bucket/path/dirB/

Note that, unlike rsync, s4cmd does not treat dirA without a trailing slash as meaning dirA/*.

So in your case you have to use:

s4cmd --dry-run sync s3://cgl-rnaseq-recompute-fixed/gtex/* s3://rnaseq.toil.20k/gtex
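If the source path comes from a variable or a script, the wildcard form can be built with a small helper. This is only a sketch: `with_wildcard` is a hypothetical bash function, not part of s4cmd, and it assumes the source is a directory-style S3 URL.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of s4cmd): turn a directory-style S3 URL
# into the "dir/*" form used in the sync command above.
with_wildcard() {
  local src="${1%/}"                 # drop one trailing slash, if present
  case "$src" in
    *'/*') printf '%s\n' "$src" ;;   # already ends in /*, leave it alone
    *)     printf '%s\n' "$src/*" ;; # otherwise append /*
  esac
}

# Example use (commented out so the snippet has no side effects):
# s4cmd --dry-run sync "$(with_wildcard s3://cgl-rnaseq-recompute-fixed/gtex)" s3://rnaseq.toil.20k/gtex
```

Quoting the command substitution keeps the local shell from trying to expand the `*` itself, so the pattern reaches s4cmd unchanged.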

See the s4cmd documentation, which is very helpful:

https://github.com/bloomreach/s4cmd

Piyush Patil