I'm having trouble running an example mrjob (https://github.com/Yelp/mrjob) job with EMR on AWS.
It fails with the following error:
Using configs in /home/ciceromoura/.mrjob.conf
Creating temp directory /tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991
writing master bootstrap script to /tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/b.sh
uploading working dir files to s3://datalake-exemplo/tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/files/wd...
Copying other local files to s3://datalake-exemplo/tmp/MR-DataMining-3.ciceromoura.20200606.202114.850991/files/
Created new cluster j-3342SIBA7GY23
Added EMR tags to cluster j-3342SIBA7GY23: __mrjob_label=MR-DataMining-3, __mrjob_owner=ciceromoura, __mrjob_version=0.7.3
Waiting for Step 1 of 2 (s-2Z88F1LWZ8HPL) to complete...
CANCELLED (Job terminated)
Cluster j-3342SIBA7GY23 was TERMINATED_WITH_ERRORS: The given SSH key name was invalid
Step 1 of 2 failed
Terminating cluster: j-3342SIBA7GY23
My configuration file (mrjob.conf):
runners:
  emr:
    aws_access_key_id: xxxxxxxxxxx
    aws_secret_access_key: xxxxxxxxxxxxx
    ec2_key_pair: EMR
    ec2_key_pair_file: ~/.ssh//EMR.pem
    ssh_tunnel: true
    instance_type: m5.xlarge
    num_core_instances: 3
The command executed:
python3 MR-DataMining-3.py -r emr s3://bucket/file.txt --output-dir=s3://bucket/output/ --cloud-tmp-dir=s3://bucket/tmp
I have already checked the SSH key, replaced it, and generated a new one, but the error persists. The cluster is created automatically, right? What am I doing wrong? Is an AMI required?
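For reference, this is roughly how the key pair name could be checked against a given region. It is only a sketch: it assumes boto3 is installed and AWS credentials are configured, the helper names `key_pair_exists` and `list_key_pairs` are my own (not part of mrjob), and the region in the example comment is a placeholder for whichever region the cluster is actually created in.

```python
def key_pair_exists(name, key_pairs):
    """Return True if an EC2 key pair called `name` is in `key_pairs`.

    `key_pairs` is a list of dicts shaped like the "KeyPairs" entries
    returned by boto3's EC2 describe_key_pairs(). The match is exact,
    so "EMR" does not match "emr" or "EMR.pem".
    """
    return any(kp.get("KeyName") == name for kp in key_pairs)


def list_key_pairs(region):
    """Fetch the key pairs visible in one region (needs AWS credentials)."""
    import boto3  # imported lazily so key_pair_exists works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_key_pairs()["KeyPairs"]


# Example (placeholder region, replace with the cluster's region):
#   print(key_pair_exists("EMR", list_key_pairs("us-west-2")))
```

The name passed in is the value of `ec2_key_pair` from the config above, not the path to the .pem file.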