I have Java code that tries to initialize a remote filesystem on S3 using configuration (this was previously on HDFS, and I am trying to move it to S3 without modifying the code too much).
This is the config:
fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain
fs.defaultFS=s3a://mybucket.devrun.algo-resources/
Then, in the setup, I use:
hdfsFileSystem = FileSystem.get(conf);
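Putting it together, a minimal sketch of the setup (simplified from the real class; only the two properties above are from the actual config, the rest is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import java.io.IOException;

public class HdfsSyncUtils {
    private final FileSystem hdfsFileSystem;

    public HdfsSyncUtils() throws IOException {
        Configuration conf = new Configuration();
        // Same values as in the properties file above
        conf.set("fs.defaultFS", "s3a://mybucket.devrun.algo-resources/");
        conf.set("fs.s3a.aws.credentials.provider",
                "com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
        // The S3A filesystem is initialized (and the bucket probed) here
        hdfsFileSystem = FileSystem.get(conf);
    }
}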
This results in the following exception:
org.apache.hadoop.fs.s3a.AWSS3IOException: doesBucketExist on mybucket.devrun.algo-resources: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 1E91F85FA3751C44), S3 Extended Request ID: 5KDgH7lsaIX7l5DQcdBdUjeg/qxYgOEU4WJBOL0p090kqNNlYOAie31zuYUQw+R3LN4CvavdVJk=: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 1E91F85FA3751C44)
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:178)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:282)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:236)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:181)
at pipeline.HdfsSyncUtils.<init>(HdfsSyncUtils.java:32)
at pipeline.QueriesForArticles$QFAMapper.setup(QueriesForArticles.java:158)
at AWSPipeline.C2SIndexSearchingAlgo.<init>(C2SIndexSearchingAlgo.java:41)
at AWSPipeline.ABTestMainRunner.main(ABTestMainRunner.java:27)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 1E91F85FA3751C44), S3 Extended Request ID: 5KDgH7lsaIX7l5DQcdBdUjeg/qxYgOEU4WJBOL0p090kqNNlYOAie31zuYUQw+R3LN4CvavdVJk=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1107)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1070)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:276)
... 11 more
I use hadoop-aws.jar version 2.8.0. I use awsfed to create a credentials file under ~/.aws, so I think I have my credentials right.
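For reference, the file it creates has the standard AWS credentials-file shape (values here are placeholders; the session token line is present because awsfed presumably issues temporary federated credentials):

[default]
aws_access_key_id = <access key>
aws_secret_access_key = <secret key>
aws_session_token = <session token>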
Any idea what this error means? There's no detailed error message...
Edit:
For anyone interested, I solved this: following this answer, I came to the conclusion that the problem is region-related. My bucket is in the us-east-2 region. I tried opening a bucket in another region and it worked!
This is probably related to what can be seen in the docs: S3 in us-east-2 supports only Signature Version 4 signing, and my code (hadoop-aws.jar 2.8.0) probably uses an older signature version.
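If you need to stay in us-east-2 rather than switch regions, the S3A documentation suggests pointing the client at the region-specific endpoint so that V4 signing is used, e.g.:

fs.s3a.endpoint=s3.us-east-2.amazonaws.com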