
I am using Hadoop 1.0.3 to run some data-crunching jobs. My reducer does not write to HDFS; instead, it writes its results directly to MongoDB. Recently I have started to face a problem: my jobs sometimes "time out" and restart, and the message I get from the Hadoop console is "Task attempt_201301241103_0003_m_000001_0 failed to report status for 601 seconds". So I think the problem lies with my approach, which is to write to MongoDB instead of HDFS. I want to fake the Hadoop job status report so the task is not killed. How can I do that? Please help.

Also, I have observed that my reducer always stays at 0% and only the map phase shows a steady increase in percentage. As soon as the job completes, the reducer jumps to 100% all of a sudden.

Thank you, Regards, Mohsin

sp3tsnaz

1 Answer


The message you are seeing on the console is from the map phase, not the reduce phase. Notice the "m" in the attempt ID (attempt_..._m_...). To keep reporting progress and prevent the timeout, call context.progress() inside your map method whenever you do a long-running operation. See the StatusReporter API: http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/StatusReporter.html
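As a sketch of the idea (assuming the new mapreduce API; MongoWriteMapper and writeToMongo are hypothetical names standing in for your own class and your slow external write), calling context.progress() after each record tells the framework the task is still alive:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MongoWriteMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Placeholder for the slow external write; Hadoop sees no
        // output and no counter updates while this runs.
        writeToMongo(value);

        // Report liveness. A task that shows no progress for
        // mapred.task.timeout milliseconds (600000 by default,
        // matching the "601 seconds" in the error) gets killed.
        context.progress();
    }

    private void writeToMongo(Text value) {
        // hypothetical MongoDB insert goes here
    }
}
```

Updating a counter (context.getCounter(...).increment(1)) or setting a status string (context.setStatus(...)) also counts as progress. Alternatively, if your writes are legitimately slow, you can raise mapred.task.timeout in the job configuration rather than faking progress.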

Magham Ravi