I have an EMR cluster that launches in a private subnet in eu-west-1. The subnet's route table has a gateway endpoint for S3. I need to access this public object exposed by AWS: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar, and the download fails with the error below. I think this is because cross-region access through a gateway endpoint is not allowed; I can access other buckets that are in the same region. Is there a workaround, maybe through NAT? The route table already has a route to a NAT gateway, but the request does not seem to go through it.

2019-04-10T05:17:06.849Z INFO Ensure step 1 jar file s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
INFO Failed to download: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
java.lang.RuntimeException: Error whilst fetching 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar'
    at aws157.instancecontroller.util.S3Wrapper.fetchS3HadoopFileToLocal(S3Wrapper.java:412)
    at aws157.instancecontroller.util.S3Wrapper.fetchHadoopFileToLocal(S3Wrapper.java:351)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner$Runner.<init>(HadoopJarStepRunner.java:243)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:152)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:146)
    at aws157.instancecontroller.master.steprunner.StepExecutor.runStep(StepExecutor.java:136)
    at aws157.instancecontroller.master.steprunner.StepExecutor.run(StepExecutor.java:70)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.enqueueStep(StepExecutionManager.java:248)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.doRun(StepExecutionManager.java:195)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager.access$000(StepExecutionManager.java:33)
    at aws157.instancecontroller.master.steprunner.StepExecutionManager$1.run(StepExecutionManager.java:94)
Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: connect timed out
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:618)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1143)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1021)
    at aws157.instancecontroller.util.S3Wrapper.copyS3ObjectToFile(S3Wrapper.java:303)
    at aws157.instancecontroller.util.S3Wrapper.getFile(S3Wrapper.java:287)
    at aws157.instancecontroller.util.S3Wrapper.fetchS3HadoopFileToLocal(S3Wrapper.java:399)
    ... 10 more
ishan3243
  • If you already have a NAT Gateway, then it should handle this traffic automatically. `Unable to execute HTTP request: connect timed out` implies that the NAT Gateway is misconfigured -- not included in the route table, or perhaps deployed in the same subnet that it is intended to serve, which is not correct. An S3 gateway endpoint will never try to route cross-region traffic. – Michael - sqlbot Apr 10 '19 at 18:57
  • @Michael-sqlbot Yes, it is deployed in the same subnet it's supposed to serve. How is that a problem? – ishan3243 Apr 10 '19 at 19:12
  • A NAT Gateway must be in a public subnet, with the default route for that subnet's route table pointing to the Internet Gateway. If deployed in the subnet it serves, when the NAT Gateway tries to access the Internet, its outgoing traffic loops right back to itself, because the default route for the subnet points to the NAT Gateway. – Michael - sqlbot Apr 10 '19 at 19:21
  • @Michael-sqlbot Thanks! Please convert to answer. – ishan3243 Apr 11 '19 at 07:52

1 Answer

An S3 gateway endpoint will never try to route cross-region traffic, but a NAT Gateway should handle this traffic automatically. Given that a NAT Gateway is in place, `Unable to execute HTTP request: connect timed out` implies that the NAT Gateway (or a setting associated with it) is misconfigured.
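To illustrate why cross-region S3 traffic bypasses the gateway endpoint, here is a minimal sketch of most-specific-route selection. The route table contents are assumptions for illustration only: the CIDRs are documentation placeholder ranges (not real S3 ranges), and the target names `vpce-s3-gateway` and `nat-gateway` are hypothetical labels. A real gateway endpoint route uses a prefix list covering the S3 ranges of the endpoint's own region.

```python
import ipaddress


def pick_route(routes, dest_ip):
    """Return the target of the most specific route that matches dest_ip."""
    ip = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix (most specific network) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical private-subnet route table (all values are placeholders).
PRIVATE_ROUTES = [
    ("10.0.0.0/16", "local"),
    ("198.51.100.0/24", "vpce-s3-gateway"),  # stands in for the same-region S3 prefix list
    ("0.0.0.0/0", "nat-gateway"),            # default route
]

# Placeholder address inside the same-region S3 range: uses the endpoint.
print(pick_route(PRIVATE_ROUTES, "198.51.100.10"))
# Placeholder address for S3 in another region: falls through to the default route.
print(pick_route(PRIVATE_ROUTES, "203.0.113.10"))
```

In this model the request for the us-east-1 bucket never touches the endpoint, so a connect timeout points at whatever the default route leads to.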

As noted in comments, the specific issue here was that the NAT Gateway was provisioned in the same subnet it was intended to serve. This isn't a valid configuration, because in this case the NAT Gateway tries to reach the Internet... via itself... since it gets its default route from the subnet where it's deployed.
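The loop described above can be detected from route table data. The following is a rough sketch, assuming a dictionary shaped like one entry of the `RouteTables` list returned by boto3's `describe_route_tables`; the IDs in the example are hypothetical, and the helper names are mine, not part of any AWS library.

```python
def default_route_target(route_table):
    """Return the NAT or gateway ID that the 0.0.0.0/0 route points to, if any."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            return route.get("NatGatewayId") or route.get("GatewayId")
    return None


def nat_routes_to_itself(nat_gateway_id, nat_subnet_route_table):
    """True if the NAT gateway's own subnet sends default traffic back to it."""
    return default_route_target(nat_subnet_route_table) == nat_gateway_id


# Hypothetical route table of the subnet that hosts the NAT gateway.
broken_table = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0example"},
    ]
}
# A public subnet instead points its default route at an Internet Gateway.
public_table = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0example"},
    ]
}

print(nat_routes_to_itself("nat-0example", broken_table))
print(nat_routes_to_itself("nat-0example", public_table))
```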

From the AWS documentation:

> To create a NAT gateway, you must specify the public subnet in which the NAT gateway should reside.
>
> ...
>
> After you've created a NAT gateway, you must update the route table associated with one or more of your private subnets to point Internet-bound traffic to the NAT gateway. This enables instances in your private subnets to communicate with the internet. (emphasis added)

https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html#nat-gateway-basics
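One possible way to apply that fix with boto3 is sketched below. It assumes you pass in an EC2 client (for example `boto3.client("ec2", region_name="eu-west-1")`), that a public subnet and an Elastic IP allocation already exist, and that the private route table already has a default route to replace. The function name and parameter names are hypothetical, and the old, misplaced NAT gateway still has to be deleted separately.

```python
def place_nat_in_public_subnet(ec2, public_subnet_id, allocation_id,
                               private_route_table_id):
    """Create a NAT gateway in a public subnet and route private traffic to it.

    `ec2` is expected to behave like a boto3 EC2 client.
    """
    # The NAT gateway must live in a subnet whose default route is an Internet Gateway.
    response = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=allocation_id,
    )
    nat_id = response["NatGateway"]["NatGatewayId"]

    # Wait until the gateway is usable before routing traffic to it.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # Point the private subnet's existing default route at the new gateway.
    ec2.replace_route(
        RouteTableId=private_route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )
    return nat_id
```

`replace_route` is used here because the question states the private route table already has a NAT route; on a table with no default route, `create_route` would be the call to use instead.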

Michael - sqlbot