
I am trying to run a job where each mapper 'type' receives a different input file. I know there is a way to do this in Java using the MultipleInputs class, like so:

MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, CounterMapper.class);
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, CountertwoMapper.class);

Where CounterMapper.class and CountertwoMapper.class are the respective mapper 'types'.

I am trying to achieve similar functionality with MrJob for Python or any other language that is not Java (please don't ask why!).

This image is similar to what I want to achieve.

Any help is appreciated.

John Vandenberg

1 Answer


I have found a way to associate different mappers with a single input path. This doesn't exactly answer your question, but I hope it helps. See the linked question below:

Using multiple mapper inputs in one streaming job on hadoop?
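
For what it's worth, here is a minimal sketch of one way this is often done with Hadoop Streaming (it may or may not match what the linked question describes): a single mapper script checks which input file its split came from and dispatches to a different mapper function. Hadoop Streaming exports job configuration properties as environment variables with dots replaced by underscores, so the current file should be available as mapreduce_map_input_file (or map_input_file on older versions). The two mapper functions and the "two_" file-name prefix are hypothetical stand-ins for illustration, not anything from the question.

```python
import os
import sys


def counter_mapper(line):
    """Hypothetical stand-in for CounterMapper: emit (word, 1) per word."""
    for word in line.split():
        yield word, 1


def counter_two_mapper(line):
    """Hypothetical stand-in for CountertwoMapper: emit the line's length."""
    yield "line_length", len(line)


def current_input_file(environ=os.environ):
    # Streaming exposes job conf as env vars, with "." replaced by "_".
    # Newer Hadoop uses mapreduce.map.input.file; older uses map.input.file.
    return environ.get("mapreduce_map_input_file") or environ.get("map_input_file", "")


def choose_mapper(input_path):
    # Assumption: inputs meant for the second mapper are named "two_*".
    # Replace this rule with whatever distinguishes your input paths.
    if os.path.basename(input_path).startswith("two_"):
        return counter_two_mapper
    return counter_mapper


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Pick the mapper once per task, since each map task reads one split.
    mapper = choose_mapper(current_input_file())
    for line in stdin:
        for key, value in mapper(line.rstrip("\n")):
            stdout.write(f"{key}\t{value}\n")
```

To use it as a streaming mapper, add a call to main() at the bottom of the script, pass the script with -mapper, and give both input paths with -input. With mrjob, the same lookup can probably be done inside a single mapper method via mrjob.compat.jobconf_from_env, but I have not tested that here.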