
I'm trying to SSH into an Amazon EMR Spark cluster. Here's what I did:

  1. Get the cluster master's public DNS name:

    aws emr describe-cluster --cluster-id <cluster_id> | grep MasterPublicDnsName
    
  2. Use that hostname to SSH into the box:

    ssh -i CSxxx.pem hadoop@ec2-xx-xxx-xxx-xxx.ap-southeast-1.compute.amazonaws.com
    

I'm stuck here: running step 2 gives the following error:

    ssh: connect to host ec2-xx-xxx-xxx-xxx.ap-southeast-1.compute.amazonaws.com port 22: Operation timed out

Any ideas on how to fix this?

xpm

2 Answers


"Operation timed out" typically happens for one of two reasons:

  • The IP address you're connecting from is not allowed by the EMR cluster's security group. To check, open the cluster's page in the console, find its security group, and click it. Then edit the inbound rules, add a rule for SSH, and pick your IP from the dropdown in the source field.

  • Or, if you created the EMR cluster in a custom VPC and launched it into a private subnet, you can't SSH into it directly. You first have to SSH into an instance in a public subnet of the same VPC, then SSH from there to the cluster's master node. This is less likely to be the issue if you don't have custom VPCs in your AWS account.
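For the private-subnet case, the two hops can be combined with OpenSSH's `-J` (ProxyJump) option, available in OpenSSH 7.3 and later. The following is only a sketch: the bastion login `ec2-user@bastion.example.com` and the private address `10.0.1.23` are placeholders I made up, not values from the question, and the helper `emr_jump_cmd` is hypothetical.

```shell
#!/bin/sh
# Sketch: print the ssh command for reaching the master node through a
# bastion (jump) host in a public subnet of the same VPC.
# Assumptions: the bastion login and the private address below are placeholders.

# Build an ssh command line that uses -J (ProxyJump, OpenSSH 7.3 or later).
# $1 = user@bastion-host, $2 = master node address as seen from inside the VPC
emr_jump_cmd() {
    printf 'ssh -J %s hadoop@%s\n' "$1" "$2"
}

# Print the command instead of running it, so it can be reviewed first.
emr_jump_cmd "ec2-user@bastion.example.com" "10.0.1.23"
```

As far as I know, an `-i` given on the command line is not applied to the jump hop, so it is probably simplest to load the key into the agent first with `ssh-add CSxxx.pem` before running the printed command.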

Kristian

Adding the steps to update the SSH rule. The security group is in the EC2 dashboard.

1) Navigate to the EC2 dashboard -> Security Groups

2) Find the group ElasticMapReduce-master -> Inbound -> Edit -> Add rule

3) Add an SSH rule and choose My IP as the source

Now you should be able to SSH into the master node.
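If you prefer the command line, the same rule can be added with the AWS CLI. This is a minimal sketch, not something I have run against your account: `sg-xxxxxxxx` stands in for the ID of the ElasticMapReduce-master group, `203.0.113.7` is a documentation address standing in for your public IP, and the helper names `to_cidr` and `allow_ssh_from` are hypothetical.

```shell
#!/bin/sh
# Sketch: allow SSH (TCP port 22) from a single address in a security group.
# Assumption: the group ID and IP address passed in are supplied by you;
# the values shown in the usage note are placeholders.

# Turn a single IPv4 address into the /32 CIDR block the rule expects.
to_cidr() {
    printf '%s/32\n' "$1"
}

# Add an inbound SSH rule for one address.
# $1 = security group ID, $2 = your public IPv4 address
allow_ssh_from() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 22 \
        --cidr "$(to_cidr "$2")"
}
```

After sourcing the file, it would be called as `allow_ssh_from sg-xxxxxxxx 203.0.113.7`. Using a /32 keeps the rule limited to your own address, which matches what the My IP option does in the console.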

Fang Zhang