I'm working in Java and was able to kick off a mapreduce job. The job made it through the ShardedJob stage, but is now stuck on the ExamineStatusAndReturnResult stage. In the task queue I see a number of tasks like /mapreduce/workerCallback/map-hex-string. These tasks are all getting re-queued because the return code is 429 Too Many Requests (https://www.rfc-editor.org/rfc/rfc6585#section-4). I feel as though I'm hitting some sort of quota limit, but I cannot figure out where or why.

How can I tell why these tasks are receiving a 429 response code?

wspeirs

1 Answer

The mapreduce library tries to avoid running out of memory (OOM) by doing its own bookkeeping of estimated memory consumption (this can be tuned by overriding the Worker/InputReader/OutputWriter estimateMemoryRequirement methods, and it works best when MR jobs run in their own instances [module, backend, version]). Upon receiving an MR request from the task queue, the mapreduce library checks the request's estimated memory requirement; if that is more than what is currently available, the request is rejected with HTTP error code 429. To minimize such cases you should either increase the amount of available resources (instance type, number of instances) and/or decrease the parallel load (number of concurrent jobs, shards per job), and avoid any other type of load on the same instances.
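
Below is a minimal sketch (mine, not part of the original answer) of overriding estimateMemoryRequirement on a mapper, assuming the App Engine MapReduce Java API in which Mapper<I, K, V> extends Worker and the estimate is returned in bytes as a long; the class name and the 64 MB figure are purely illustrative.

    import com.google.appengine.tools.mapreduce.Mapper;

    public class CountingMapper extends Mapper<String, String, Long> {

        // Rough upper bound on this mapper's per-slice working set
        // (buffers, caches, largest values held at once), in bytes.
        private static final long ESTIMATED_BYTES = 64L * 1024 * 1024;

        @Override
        public long estimateMemoryRequirement() {
            // The library combines this with the InputReader/OutputWriter estimates
            // and answers the task-queue request with a 429 when the total exceeds
            // the memory currently available on the instance.
            return ESTIMATED_BYTES;
        }

        @Override
        public void map(String value) {
            // ... real map logic would go here ...
            emit(value, 1L);
        }
    }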

ozarov
  • Thanks for the info. Is there any way to know how much memory was used at each stage (is it in the pipeline report?) so I can override these methods and provide better estimates to the engine? – wspeirs Jun 11 '14 at 18:03
  • As mentioned, you can override your worker's (mapper/reducer) estimateMemoryRequirement method to return the amount of memory it needs for its operation. The MR library will use this value in addition to the estimates that come from the input reader and the output writer. – ozarov Jun 11 '14 at 18:50
  • I understand I can tell the library how much memory I *think* it's going to use, but this is just a guess... how can I tell how much it actually used? – wspeirs Jun 15 '14 at 18:05
  • You can check the difference in Runtime.getRuntime().freeMemory(), but the MR library takes the estimate before it runs the mapper/reducer for each slice. – ozarov Jun 16 '14 at 16:43
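
Following up on the last comment: to get a ballpark for what a slice actually consumes, one option (again an illustrative sketch, not from the answer) is to compare the heap usage reported by Runtime before and after the mapper logic. Garbage collection makes the numbers noisy, but they give an order of magnitude to feed back into estimateMemoryRequirement. The snippet assumes it lives inside a mapper class like the one sketched above.

    // Approximate heap in use right now, in bytes.
    private static long heapUsed() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public void mapWithMeasurement(String value) {
        long before = heapUsed();
        map(value);  // the real mapper logic
        long after = heapUsed();
        System.out.println("Approximate heap growth: " + (after - before) + " bytes");
    }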