I'm trying to use Hadoop on Amazon Elastic MapReduce, where I have thousands of map tasks to perform. I'm OK with a small percentage of the tasks failing; however, Amazon shuts down the job and I lose all of the results as soon as the first mapper fails. Is there a setting I can use to increase the number of failed tasks that are allowed? Thanks.
Here's the answer for Hadoop:
Is there any property to define failed mapper threshold
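In case that link goes stale: as far as I remember, the relevant properties are mapred.max.map.failures.percent and mapred.max.reduce.failures.percent on Hadoop 0.20/1.x (renamed to mapreduce.map.failures.maxpercent / mapreduce.reduce.failures.maxpercent in later releases), but please verify against the linked question and your Hadoop version. On a plain Hadoop cluster you could pass it per job, roughly like this (the jar name, driver class and paths are placeholders, and this assumes your driver runs through ToolRunner so -D generic options are picked up):

# example only: jar, driver class and input/output paths are placeholders
hadoop jar my-job.jar com.example.MyDriver \
  -D mapred.max.map.failures.percent=20 \
  s3://myawsbucket/input s3://myawsbucket/output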
To use the setting described above in EMR, look at:
Specifically, you create an XML file (config.xml in the example) with the setting you want to change and apply a bootstrap action:
./elastic-mapreduce --create \
  --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
  --args "-M,s3://myawsbucket/config.xml"
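
For completeness, here is a sketch of what config.xml might contain, assuming the Hadoop 1.x property name mapred.max.map.failures.percent and that you can tolerate up to 20% failed map tasks (adjust the property name and value for your Hadoop version and failure budget):

<?xml version="1.0"?>
<configuration>
  <!-- Allow up to 20% of map tasks to fail without failing the whole job -->
  <property>
    <name>mapred.max.map.failures.percent</name>
    <value>20</value>
  </property>
</configuration>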
