I have been given a multi-step Cascading program that takes about ten times as long to run as an equivalent M/R job. How do I figure out which of the steps is the slowest, so that I can target it for optimization?

Robert Rapplean

1 Answer

Not a complete answer, but I think it is enough to get you started. You need to generate a graphical representation of the MapReduce workflow for your job. See this page for an example: http://www.cascading.org/multitool/. The graph should help you work out where the bottleneck is.
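Once the graph tells you which steps exist, one rough way to find the slow one is to compare wall-clock time per step. Below is a minimal sketch, assuming you have copied each step's start and finish timestamps by hand from wherever your cluster reports them. The class name `StepTimings`, the method `rankBySlowest`, the step names, and the numbers are all hypothetical placeholders, not part of Cascading or Hadoop.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StepTimings {

    /**
     * Returns the step names ordered from longest to shortest duration.
     * Each map value is a two-element array: {startMillis, finishMillis}.
     */
    public static List<String> rankBySlowest(Map<String, long[]> startEndMillis) {
        List<String> names = new ArrayList<>(startEndMillis.keySet());
        names.sort(Comparator.comparingLong(
                (String name) -> startEndMillis.get(name)[1] - startEndMillis.get(name)[0])
                .reversed());
        return names;
    }

    public static void main(String[] args) {
        // Placeholder timings (milliseconds); replace with values from your own run.
        Map<String, long[]> timings = new LinkedHashMap<>();
        timings.put("step-1", new long[] {0L, 40_000L});
        timings.put("step-2", new long[] {40_000L, 400_000L});
        timings.put("step-3", new long[] {400_000L, 460_000L});

        // Print the steps, slowest first, with their durations in seconds.
        for (String name : rankBySlowest(timings)) {
            long[] t = timings.get(name);
            System.out.println(name + ": " + (t[1] - t[0]) / 1000 + " s");
        }
    }
}
```

The step at the top of the list is the first candidate to look at in the graph.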

mohit6up