I read Hadoop in Action and found that in Java, the MultipleOutputFormat and MultipleOutputs classes let a job write its reduce output to multiple files. What I am not sure about is how to achieve the same thing with Python streaming.
For example:

              / out1/part-0000
    mapper -> reducer
              \ out2/part-0000
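
To make the goal concrete, here is a rough sketch of the only approach I can think of so far. As far as I understand, a streaming reducer can only write to stdout, so this version prefixes every output line with the name of the output it should end up in and leaves the actual split to a later step. The names `route`, `tag_records`, `split_tagged` and `main`, and the routing rule itself, are made up for illustration; none of this is a Hadoop API.

```python
import sys


def route(key):
    """Hypothetical routing rule: choose an output name for a key."""
    return "out1" if key.startswith("a") else "out2"


def tag_records(lines, router=route):
    """Yield 'output_name<TAB>key<TAB>value' for each 'key<TAB>value' line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        yield f"{router(key)}\t{key}\t{value}"


def split_tagged(lines):
    """Post-processing step: group tagged lines by their output name."""
    outputs = {}
    for line in lines:
        name, _, rest = line.rstrip("\n").partition("\t")
        outputs.setdefault(name, []).append(rest)
    return outputs


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Reducer entry point: read key/value lines, write tagged lines."""
    for tagged in tag_records(stdin):
        stdout.write(tagged + "\n")

# In an actual reducer script, main() would be called here.
```

This still produces a single tagged output per reducer rather than separate out1/ and out2/ directories, so it is a workaround and not the equivalent of MultipleOutputs that I am looking for.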
If anyone knows how to do this, has heard of a way, or has done something similar, please let me know.