Generating multiple equally sized output files in Hadoop

Question

What are some methods for finding X data ranges in Hadoop so that one can use these ranges as partitions in the reducer step?

score 0 · Answer 1 · edited Jun 20 '20 at 09:12


It looks like you need something like TotalOrderPartitioner, which produces a total order across the reducer outputs by reading split points (the key boundaries between partitions) from an externally generated source, so that each reducer handles one contiguous key range. You might find this link useful: http://chasebradford.wordpress.com/2010/12/12/reusable-total-order-sorting-in-hadoop/
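To make the idea of split points a bit more concrete, here is a rough sketch in plain Python. It is not Hadoop code: the function names `find_split_points` and `get_partition` are made up for illustration, and it assumes you can draw a reasonably representative sample of your keys. It picks the split points at evenly spaced quantiles of the sorted sample and then routes each key to a range with a binary search, which is roughly the job a total-order partitioner does for you.

```python
import bisect


def find_split_points(sample_keys, num_partitions):
    """Return num_partitions - 1 split points taken at evenly spaced
    quantiles of the sorted sample (hypothetical helper, not a Hadoop API)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    ordered = sorted(sample_keys)
    if not ordered:
        return []
    n = len(ordered)
    # The i-th split point sits i/num_partitions of the way through the sample.
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def get_partition(key, split_points):
    """Return the index of the range that key falls into.

    Keys below the first split point go to partition 0; a key equal to a
    split point goes to the partition that starts at that split point.
    """
    return bisect.bisect_right(split_points, key)


if __name__ == "__main__":
    splits = find_split_points(range(1000), 4)
    counts = [0] * 4
    for key in range(1000):
        counts[get_partition(key, splits)] += 1
    print(splits, counts)
```

How evenly sized the resulting outputs are depends on how well the sample reflects the real key distribution; with a skewed or too-small sample the ranges will be uneven.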

I don't know if this is exactly what you need. Apologies if I have got it wrong.

edited Jun 20 '20 at 09:12

Community

answered Jun 19 '13 at 19:49

Tariq
