I have been given a multi-step Cascading program that takes about ten times as long to run as an equivalent M/R job. How do I figure out which step is the slowest so I can target it for optimization?
1 Answer
Not a complete answer, but I think it's enough to get you started. You need to generate a graphical representation of the MapReduce workflow that Cascading plans for your job. See this page for an example: http://www.cascading.org/multitool/. The graph shows how many MapReduce jobs the flow is compiled into and how they depend on one another, which should help you figure out where the bottleneck is.
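To make the idea concrete, here is a rough sketch of what you would do once you have the graph and a wall-clock time for each step (in practice the times would come from the JobTracker UI, since each Cascading step runs as one MapReduce job). As far as I know, Cascading's `Flow` class has a `writeDOT` method that dumps the planned flow to a DOT file, but check the Javadoc for your version. The code below does not call Cascading at all: `StepGraphSketch`, the step names, and the timings are made up for illustration. It only shows how to annotate a DOT graph with per-step times and flag the slowest step.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Cascading: annotates a linear chain of
// steps with elapsed times and highlights the slowest one in a DOT graph.
public class StepGraphSketch {

    /** Returns the name of the step with the largest elapsed time, or null if the map is empty. */
    public static String slowestStep(Map<String, Long> elapsedMillis) {
        String slowest = null;
        long max = Long.MIN_VALUE;
        for (Map.Entry<String, Long> e : elapsedMillis.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }

    /** Renders the steps, in insertion order, as a DOT digraph with the slowest step coloured red. */
    public static String toDot(Map<String, Long> elapsedMillis) {
        String slowest = slowestStep(elapsedMillis);
        StringBuilder dot = new StringBuilder("digraph steps {\n");
        String previous = null;
        for (Map.Entry<String, Long> e : elapsedMillis.entrySet()) {
            String name = e.getKey();
            // Node label: step name, then elapsed time on a second line ("\n" is DOT's line break).
            dot.append("  \"").append(name).append("\" [label=\"")
               .append(name).append("\\n").append(e.getValue()).append(" ms\"");
            if (name.equals(slowest)) {
                dot.append(", color=red");
            }
            dot.append("];\n");
            // Edge from the previous step, assuming the steps run one after another.
            if (previous != null) {
                dot.append("  \"").append(previous).append("\" -> \"")
                   .append(name).append("\";\n");
            }
            previous = name;
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Made-up step names and timings, purely for illustration.
        Map<String, Long> elapsed = new LinkedHashMap<>();
        elapsed.put("step-1", 42_000L);
        elapsed.put("step-2", 310_000L);
        elapsed.put("step-3", 18_000L);
        System.out.print(toDot(elapsed));
    }
}
```

You can render the output with Graphviz (`dot -Tpng`) and compare it against the graph Cascading produces. A real flow is usually not a straight chain, so treat the linear edges here as a simplification.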

mohit6up