I am using Python with Hadoop Streaming and need to handle the following scenario:
a) Chain two jobs: Map1 -> Reduce1 -> Map2 -> Reduce2.
b) I don't want to store intermediate files between the two jobs.
c) I don't want to install packages like Cascading, Yelp's mrjob, or Oozie. I am keeping them as a last option.
I have already gone through similar discussions on SO and elsewhere, but could not find an answer specific to Python. Can you please suggest an approach?
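
To make the pipeline shape concrete, here is a minimal sketch in plain Python of what I mean by Map1 -> Reduce1 -> Map2 -> Reduce2. The word-count and group-by-frequency steps are hypothetical placeholders, not my real jobs, and `run_stage` and `run_pipeline` are names made up for illustration. It runs locally in memory rather than on Hadoop, and the sort between map and reduce only stands in for the shuffle.

```python
from itertools import groupby
from operator import itemgetter


def map1(line):
    # Stage 1 mapper (placeholder): emit (word, 1) for every word in the line.
    for word in line.split():
        yield word, 1


def reduce1(word, counts):
    # Stage 1 reducer (placeholder): sum the counts for a single word.
    yield word, sum(counts)


def map2(word, count):
    # Stage 2 mapper (placeholder): re-key by count.
    yield count, word


def reduce2(count, words):
    # Stage 2 reducer (placeholder): collect the words sharing one count.
    yield count, sorted(words)


def run_stage(records, mapper, reducer):
    # Apply the mapper, sort by key (a local stand-in for the shuffle),
    # then call the reducer once per key.
    mapped = [kv for rec in records for kv in mapper(*rec)]
    mapped.sort(key=itemgetter(0))
    for key, group in groupby(mapped, key=itemgetter(0)):
        yield from reducer(key, (value for _, value in group))


def run_pipeline(lines):
    # Stage 1 output is a generator consumed directly by stage 2,
    # so nothing is written to disk between the two stages.
    stage1 = run_stage(((line,) for line in lines), map1, reduce1)
    stage2 = run_stage(stage1, map2, reduce2)
    return list(stage2)


if __name__ == "__main__":
    print(run_pipeline(["a b a", "b c a"]))
```

In this local version the output of Reduce1 is streamed straight into Map2 without touching disk. What I am asking is whether the same chaining can be done with Hadoop Streaming itself.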