I am copying 800 avro files, size around 136 MB, from HDFS to S3 on EMR cluster, but Im getting this exception:
8/06/26 10:53:14 INFO mapreduce.Job: map 100% reduce 91%
18/06/26 10:53:14 INFO mapreduce.Job: Task Id : attempt_1529995855123_0003_r_000006_0, Status : FAILED
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://url-to-aws-emr/user/hadoop/output/part-00258-3a28110a-9270-4639-b389-3e1f7f386ed6-c000.avro etc
at com.amazon.elasticmapreduce.s3distcp.CopyFilesReducer.cleanup(CopyFilesReducer.java:67)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
The configuration for the EMR cluster is:
core-site fs.trash.checkpoint.interval 60
core-site fs.trash.interval 60
hadoop-env.export HADOOP_CLIENT_OPTS -Xmx10g
hdfs-site dfs.replication 3
Any help will be appreciated.
Edit:
Running the hdfs dfsadmin -report command, gives the following result:
[hadoop@~]$ hdfs dfsadmin -report
Configured Capacity: 79056308744192 (71.90 TB)
Present Capacity: 78112126204492 (71.04 TB)
DFS Remaining: 74356972374604 (67.63 TB)
DFS Used: 3755153829888 (3.42 TB)
DFS Used%: 4.81%
Under replicated blocks: 126
Blocks with corrupt replicas: 0
Missing blocks: 63
Missing blocks (with replication factor 1): 0
It suggests that the block are missing. Does it mean that I have to re-run the program again? And if I see the output of Under replicated blocks, it says 126. It means 126 blocks will be replicated. How can I know, will it replicate the missing blocks?
Also, the value of Under replicated blocks is 126 for the last 30 minutes. Is there any way to force to it to replicate quickly?