
My Mapper implementation is:

public class SimpleMapper extends Mapper<Text, Text, Text, MapWritable> {

    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {

        // MapWritable keys and values must be Writable, so the strings are wrapped in Text
        MapWritable writable = new LinkedMapWritable();
        writable.put(new Text("unique_key"), new Text("one"));
        writable.put(new Text("another_key"), new Text("two"));
        context.write(new Text("key"), writable);
    }
}

And the Reducer implementation is:

public class SimpleReducer extends Reducer<Text, MapWritable, NullWritable, Text> {

    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context)
            throws IOException, InterruptedException {

        // The MapWritables have to be ordered by the "unique_key" entry inserted into each one
    }
}

Do I have to use secondary sort? Is there any other way to do so?

Sambit Tripathy

1 Answer


The MapWritable values arrive at the reducer in an unpredictable order. That order may vary from run to run, and by default you have no control over it.

What the MapReduce paradigm does guarantee is that keys are presented to each reducer in sorted order, and that all values belonging to a single key go to a single reducer.

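So you can use a secondary sort with a custom partitioner for your use case: move unique_key into a composite map output key, partition and group on the original key only, and sort on the full composite key, so the values for each key reach reduce() already ordered by unique_key.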
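To make the idea concrete, here is a minimal, Hadoop-free sketch of what a secondary sort does to the records. The names `SecondarySortSketch`, `CompositeKey`, `partitionFor` and `sortAndGroup` are hypothetical and exist only for illustration; they simulate the partition, sort and group phases in plain Java and are not Hadoop APIs. For brevity the sketch carries only the unique key, where a real job would also carry the MapWritable as the value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the field to order by.
    static final class CompositeKey {
        final String naturalKey;
        final String uniqueKey;

        CompositeKey(String naturalKey, String uniqueKey) {
            this.naturalKey = naturalKey;
            this.uniqueKey = uniqueKey;
        }
    }

    // Partition on the natural key only, so every record for that key
    // lands on the same reducer regardless of its unique key.
    static int partitionFor(CompositeKey key, int numReducers) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Sort order: natural key first, then unique key.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparing(k -> k.uniqueKey);

    // Simulates one reducer's input: sort by the full composite key,
    // then group by natural key only, preserving the sorted order.
    static Map<String, List<String>> sortAndGroup(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            grouped.computeIfAbsent(k.naturalKey, n -> new ArrayList<>())
                   .add(k.uniqueKey);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = Arrays.asList(
                new CompositeKey("key", "two"),
                new CompositeKey("key", "one"),
                new CompositeKey("other", "b"));
        System.out.println(sortAndGroup(keys));
    }
}
```

In an actual job, the composite key would be a WritableComparable, and the three pieces would be wired in with `job.setPartitionerClass(...)`, `job.setSortComparatorClass(...)` and `job.setGroupingComparatorClass(...)`.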

dpsdce