
To be specific, suppose I run:

hadoop jar hadoop-streaming.jar \
  -input myInputDirs \
  -output myOutputDir \
  -mapper /bin/cat \
  -reducer /usr/bin/wc

Here myInputDirs has a dated subfolder structure:

   input_dir/yyyy/mm/dd/part-*

I want myOutputDir to have the same dated subfolder structure:

   output_dir/yyyy/mm/dd/part-*

I guess there should be an option for this. Can "-partitioner" or any "-D" option do it?

Osiris
  • You can use Hive dynamic partitions to create folders dynamically. – Vinod ram Jan 21 '16 at 14:18
  • Hive dynamic partitioning has certain limitations and will not work in this case; the user needs to add partitions through code. @Osiris, can you add more detail about your problem so the question is easier to understand? – Sandeep Singh Jan 22 '16 at 03:36
  • I think you can create the file inside the reduce function and write to it directly (using plain Java or Python code) without using Hadoop's write(). However, I have not seen partitioned output with a whole path like p1/p2/p3/part-* before, only outputdir/part-*. – Imi.Cino Mar 20 '17 at 00:00
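
As a starting point for the "do it in your own code" idea from the comments, here is a minimal Python sketch of a streaming mapper that recovers the yyyy/mm/dd part of each record's input path and emits it as the key. It assumes Hadoop Streaming exposes the current input file through the environment (job properties are exported with dots turned into underscores, so the name depends on the Hadoop version). The helper names `date_subdir` and `tag_lines` are made up for illustration. This only tags records with their date; actually placing files under output_dir/yyyy/mm/dd still needs something on the output side, which is not shown here.

```python
import os
import re
import sys

# Matches a /yyyy/mm/dd/ run of path components, as in input_dir/yyyy/mm/dd/part-*.
_DATE_RE = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def date_subdir(path):
    """Return 'yyyy/mm/dd' taken from an input file path, or None if absent."""
    match = _DATE_RE.search(path)
    return "/".join(match.groups()) if match else None


def tag_lines(lines, input_path):
    """Yield 'yyyy/mm/dd<TAB>line' so later steps know each record's date."""
    subdir = date_subdir(input_path) or "unknown"
    for line in lines:
        yield subdir + "\t" + line.rstrip("\n")


if __name__ == "__main__":
    # Assumption: the current split's file name is available as an
    # environment variable; the exact name differs between Hadoop versions.
    path = (os.environ.get("mapreduce_map_input_file")
            or os.environ.get("map_input_file", ""))
    for out in tag_lines(sys.stdin, path):
        print(out)
```

Saved as, say, tag_by_date.py (a hypothetical name), it would be shipped with -file and used in place of /bin/cat as the -mapper; the reducer then sees the date as the first tab-separated field of every record.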

0 Answers