I have a series of map-reduce jobs that process user data (implemented with the Cascading framework), and I would like to track many fine-grained statistics. I can have between 100 and 1000 users, with about 20 statistics per user, so potentially anywhere from a few thousand up to 20,000 statistics in total. I wanted to use map-reduce counters to build those stats because they are very convenient to use in the code, but there is a limit on the number of map-reduce counters (120 by default), and according to this post: http://developer.yahoo.com/blogs/hadoop/posts/2010/08/apache_hadoop_best_practices_a/ I should not use them for more than roughly 20 to 50 custom counters.
Question: is there a proper way to track my statistics in this map-reduce context using a counter-like pattern? By "counter-like" I mean having access to counters everywhere in my code and being able to increment them wherever needed.
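
To make concrete what I mean, here is a minimal sketch (class and field names are just examples) of how I increment counters today from inside a Cascading operation, via FlowProcess.increment(), which ends up as a Hadoop counter. This is exactly the per-user pattern that blows past the counter limit:

```java
import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Function;
import cascading.operation.FunctionCall;
import cascading.tuple.Fields;

// Hypothetical example: a pass-through Function that bumps one counter
// per (user, statistic) pair -- thousands of counters in total.
public class UserStatsFunction extends BaseOperation<Void> implements Function<Void> {

    public UserStatsFunction() {
        super(Fields.ARGS); // emit the incoming tuple unchanged
    }

    @Override
    public void operate(FlowProcess flowProcess, FunctionCall<Void> functionCall) {
        String userId = functionCall.getArguments().getString("user_id");

        // Convenient "counter-like" call, available anywhere I have a FlowProcess,
        // but each distinct name becomes a separate Hadoop counter.
        flowProcess.increment("user-stats", userId + ":records_seen", 1);

        functionCall.getOutputCollector().add(functionCall.getArguments());
    }
}
```

I would like to keep this kind of call-it-anywhere ergonomics, just without hitting the Hadoop counter limit.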
Thanks in advance.