
I am copying 800 Avro files, around 136 MB in size, from HDFS to S3 on an EMR cluster, but I'm getting this exception:

18/06/26 10:53:14 INFO mapreduce.Job:  map 100% reduce 91%
18/06/26 10:53:14 INFO mapreduce.Job: Task Id : attempt_1529995855123_0003_r_000006_0, Status : FAILED
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://url-to-aws-emr/user/hadoop/output/part-00258-3a28110a-9270-4639-b389-3e1f7f386ed6-c000.avro etc
        at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.cleanup(CopyFilesReducer.java:67)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
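
For reference, the copy is launched with s3-dist-cp; the command looks roughly like this (the bucket name and paths below are placeholders, not the exact ones I use):

s3-dist-cp --src hdfs:///user/hadoop/output/ --dest s3://my-bucket/output/ --srcPattern '.*\.avro'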

The configuration for the EMR cluster is:

core-site           fs.trash.checkpoint.interval     60
core-site           fs.trash.interval                60
hadoop-env.export   HADOOP_CLIENT_OPTS               -Xmx10g
hdfs-site           dfs.replication                  3

Any help will be appreciated.

Edit:

Running the hdfs dfsadmin -report command gives the following result:

[hadoop@~]$ hdfs dfsadmin -report
Configured Capacity: 79056308744192 (71.90 TB)
Present Capacity: 78112126204492 (71.04 TB)
DFS Remaining: 74356972374604 (67.63 TB)
DFS Used: 3755153829888 (3.42 TB)
DFS Used%: 4.81%
Under replicated blocks: 126
Blocks with corrupt replicas: 0
Missing blocks: 63
Missing blocks (with replication factor 1): 0

It suggests that blocks are missing. Does that mean I have to re-run the job? The report also shows 126 under-replicated blocks, which I take to mean that 126 blocks still need to be replicated. How can I tell whether the missing blocks will be replicated as well?

Also, the under-replicated block count has stayed at 126 for the last 30 minutes. Is there any way to force it to replicate more quickly?
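
To see which files the missing and under-replicated blocks belong to, something like the following should list them (the path is just the output directory from the error above):

hdfs fsck /user/hadoop/output -files -blocks -locations
hdfs fsck / -list-corruptfileblocks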

Waqar Ahmed
  • Coincidentally, I am facing the same issue now, but I do not have any missing blocks – Viv Jun 26 '18 at 12:39

1 Answer


I got the same "Reducer task failed to copy 1 files" error, and I found logs in HDFS under /var/log/hadoop-yarn/apps/hadoop/logs related to the MR job that s3-dist-cp kicks off:

hadoop fs -ls /var/log/hadoop-yarn/apps/hadoop/logs

I copied them out to the local filesystem:

hadoop fs -get /var/log/hadoop-yarn/apps/hadoop/logs/application_nnnnnnnnnnnnn_nnnn/ip-nnn-nn-nn-nnn.ec2.internal_nnnn

Then I examined them in a text editor to find more detailed diagnostic information about the results of the reducer phase. In my case I was getting an error back from the S3 service; you might find a different error.
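
Since the logs are already aggregated to HDFS, the yarn CLI can usually dump them directly as well (the application id below is a placeholder):

yarn logs -applicationId application_nnnnnnnnnnnnn_nnnn > app_logs.txt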

Daniel D.