Hi, I just ran into a strange problem:
I am running a Java MapReduce job on EMR.
The input data is about 1 TB, and I am using 1 master + 8 slaves.
All of the instances are r2.2xlarge.
Initially, everything looked fine, as shown below:
INFO mapreduce.Job: map 0% reduce 0%
INFO mapreduce.Job: map 1% reduce 0%
INFO mapreduce.Job: map 2% reduce 0%
INFO mapreduce.Job: map 3% reduce 0%
INFO mapreduce.Job: map 4% reduce 0%
INFO mapreduce.Job: map 5% reduce 0%
INFO mapreduce.Job: map 6% reduce 0%
INFO mapreduce.Job: map 7% reduce 0%
...
However, I then noticed that the map progress started rolling back (falling from about 7% to 1%):
INFO mapreduce.Job: map 4% reduce 0%
INFO mapreduce.Job: map 5% reduce 0%
INFO mapreduce.Job: map 6% reduce 0%
INFO mapreduce.Job: map 7% reduce 0%
INFO mapreduce.Job: map 6% reduce 0%
INFO mapreduce.Job: map 5% reduce 0%
INFO mapreduce.Job: map 4% reduce 0%
INFO mapreduce.Job: map 3% reduce 0%
...
When I test with about 3 GB of data, the result is correct, the job runs smoothly, and this behavior does not show up.
Could anyone tell me why this happens?
Best.