
I have to back up my DynamoDB table to S3, but when I launch this service I receive this error after three attempts:

private.com.amazonaws.AmazonServiceException: User: arn:aws:sts::769870455028:assumed-role/DataPipelineDefaultResourceRole/i-3678d99c is not authorized to perform: elasticmapreduce:ModifyInstanceGroups (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: AccessDeniedException; Request ID: 9065ea77-0f95-11e5-8f35-39a70915a1ef)
    at private.com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
    at private.com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
    at private.com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
    at private.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
    at private.com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.invoke(AmazonElasticMapReduceClient.java:1391)
    at private.com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.modifyInstanceGroups(AmazonElasticMapReduceClient.java:785)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at private.com.amazonaws.services.datapipeline.retrier.RetryProxy.invokeInternal(RetryProxy.java:36)
    at private.com.amazonaws.services.datapipeline.retrier.RetryProxy.invoke(RetryProxy.java:48)
    at com.sun.proxy.$Proxy33.modifyInstanceGroups(Unknown Source)
    at amazonaws.datapipeline.cluster.EmrUtil.acquireCoreNodes(EmrUtil.java:325)
    at amazonaws.datapipeline.activity.AbstractClusterActivity.resizeIfRequired(AbstractClusterActivity.java:47)
    at amazonaws.datapipeline.activity.AbstractHiveActivity.runActivity(AbstractHiveActivity.java:113)
    at amazonaws.datapipeline.objects.AbstractActivity.run(AbstractActivity.java:16)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeRemoteRunner(TaskPoller.java:132)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeTask(TaskPoller.java:101)
    at amazonaws.datapipeline.taskrunner.TaskPoller$1.run(TaskPoller.java:77)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.executeWork(PollWorker.java:76)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.run(PollWorker.java:53)
    at java.lang.Thread.run(Thread.java:745)

How can I do my backup? Has anyone else had this error? Thanks.

edit: new policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:",
        "dynamodb:",
        "ec2:Describe*",
        "elasticmapreduce:Describe*",
        "elasticmapreduce:ListInstance*",
        "elasticmapreduce:AddJobFlowSteps",
        "elasticmapreduce:",
        "rds:Describe",
        "datapipeline:",
        "cloudwatch:",
        "redshift:DescribeClusters",
        "redshift:DescribeClusterSecurityGroups",
        "sdb:",
        "sns:",
        "sqs:"
      ],
      "Resource": [ "" ]
    }
  ]
}

This is the new exception:

Error during job, obtaining debugging information...
Examining task ID: task_1434014832347_0001_m_000008 (and more) from job job_1434014832347_0001
Examining task ID: task_1434014832347_0001_m_000013 (and more) from job job_1434014832347_0001
Examining task ID: task_1434014832347_0001_m_000005 (and more) from job job_1434014832347_0001
Examining task ID: task_1434014832347_0001_m_000034 (and more) from job job_1434014832347_0001
Examining task ID: task_1434014832347_0001_m_000044 (and more) from job job_1434014832347_0001
Examining task ID: task_1434014832347_0001_m_000004 (and more) from job job_1434014832347_0001
Task with the most failures(4):
-----
Task ID: task_1434014832347_0001_m_000002
URL: http://ip-10-37-138-149.eu-west-1.compute.internal:9026/taskdetails.jsp?jobid=job_1434014832347_0001&tipid=task_1434014832347_0001_m_000002
-----
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs

luca
  • Could you share your pipeline id? Thx – AravindR Jun 11 '15 at 01:12
  • The policy that you posted in the edit looks to have a typo: "elasticmapreduce:". Shouldn't it be "elasticmapreduce:*"? – AravindR Jun 11 '15 at 17:52
  • This is a display error (when I go to correct it, I see the right value), but yesterday I got the exception above (I updated the first post). – luca Jun 12 '15 at 08:13

1 Answer


The Data Pipeline agent (TaskRunner) running on your EMR cluster is trying to resize the cluster, and that call is failing. The resource role that you passed to the EMR cluster does not have permission to invoke the AmazonElasticMapReduce::modifyInstanceGroups API.

I just looked at the DefaultResourceRolePolicy, which is created by the wizard in the console (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-iam-roles.html). These are the actions it allows for EMR: "elasticmapreduce:Describe*", "elasticmapreduce:ListInstance*", "elasticmapreduce:AddJobFlowSteps"

I found that it does not allow ModifyInstanceGroups.
Please update your resource role policy to allow it, e.g. "elasticmapreduce:*".
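
As a rough illustration of the gap and the fix, the sketch below checks an action against the three EMR patterns quoted above and builds a policy document that adds the missing action. It is only a sketch under assumptions: the wildcard matching is a simplification of IAM evaluation (it ignores Deny statements, resources, and conditions), `Resource: "*"` and the policy name `AllowEmrResize` are hypothetical choices, and the role name is taken from the error message. The boto3 call (`put_role_policy`) is wrapped in a function that is not invoked here, since it needs credentials.

```python
import json
from fnmatch import fnmatchcase

# EMR actions quoted above from the wizard-generated default resource role policy.
DEFAULT_EMR_ACTIONS = [
    "elasticmapreduce:Describe*",
    "elasticmapreduce:ListInstance*",
    "elasticmapreduce:AddJobFlowSteps",
]


def is_allowed(action, allowed_patterns):
    """Simplified check: does any Allow pattern match the action?

    Action names are compared case-insensitively; '*' matches any run of
    characters. Deny statements, resources, and conditions are ignored.
    """
    action = action.lower()
    return any(fnmatchcase(action, pattern.lower()) for pattern in allowed_patterns)


def build_policy(actions):
    """Build a single-statement Allow policy (Resource "*" is an assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": ["*"]}
        ],
    }


def attach_inline_policy(role_name, policy_name, policy):
    """Attach the policy inline to the role. Requires boto3 and IAM credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )


if __name__ == "__main__":
    needed = "elasticmapreduce:ModifyInstanceGroups"
    print(is_allowed(needed, DEFAULT_EMR_ACTIONS))  # the default set does not cover it
    fixed_actions = DEFAULT_EMR_ACTIONS + [needed]
    print(is_allowed(needed, fixed_actions))
    print(json.dumps(build_policy(fixed_actions), indent=2))
    # Not run here (hypothetical policy name):
    # attach_inline_policy("DataPipelineDefaultResourceRole", "AllowEmrResize",
    #                      build_policy(fixed_actions))
```

Adding only the one missing action is narrower than "elasticmapreduce:*"; either should cover the resize call, so which to use depends on how tightly you want to scope the role.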

Thanks for reporting this bug. In the meantime, we will work on fixing the default resource role policy generated by the console wizard.

Aravind R.

AravindR
  • Thank you for your reply. I am now running the pipeline with id df-06906701V4O8V298WMPJ; the old pipeline was deleted. The DynamoDB table is big, so I will update you when the backup finishes. I am now using the policy in the first post. – luca Jun 11 '15 at 09:12