For example, in a MapReduce program I have set the number of reduce tasks to 3, and my custom Partitioner returns the value 5 for some condition. What will happen?

This may be a silly question, but please clarify it for me.

Thanks in advance

maxshuty

2 Answers

There are two sides to your question.

If there are fewer partitions than reducers, some reducers receive no data and are wasted, so you are not utilizing them fully.

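If there are more partitions than reducers, a record assigned to a partition with no matching reducer is thrown away, as per the Hadoop Definitive Guide: no reducer picks it up, and it is gone. Note that Hadoop's map-side output collector checks the partition number against the range 0 to numReduceTasks - 1, so in practice an out-of-range value such as 5 with 3 reducers fails the map task with an "Illegal partition" IOException rather than silently losing the record.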

Ramzy
  • (1) Is there a typo in your answer? In the second line you may have typed "partitions greater than the number of reduce tasks" where you meant "partitions less than the number of reduce tasks". Am I correct? (2) If there are more partitions than reduce tasks and the record gets thrown away, as the Definitive Guide says, will it throw an error? – user1932624 Oct 28 '15 at 13:44

If the partitioner returns a reducer number that does not exist, those records will not reach any reducer. So a custom partitioner must only return values from 0 to numReduceTasks - 1; do not play around with it beyond that range.
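As a minimal sketch of that range rule, the plain-Java snippet below checks whether a partition number is usable and folds an arbitrary value back into range. It does not depend on Hadoop; `PartitionSketch`, `isValidPartition`, and `safePartition` are hypothetical names chosen for illustration, and folding with a modulo is only one possible policy, not something the answers above prescribe.

```java
// Plain-Java sketch; PartitionSketch and its methods are hypothetical names,
// not part of the Hadoop API.
public class PartitionSketch {

    // True when a partition number can be served by one of the reducers,
    // i.e. it lies in the range 0 .. numReduceTasks - 1.
    static boolean isValidPartition(int partition, int numReduceTasks) {
        return partition >= 0 && partition < numReduceTasks;
    }

    // Folds any integer into the valid range. Math.floorMod never returns a
    // negative result for a positive divisor, unlike the % operator.
    static int safePartition(int rawValue, int numReduceTasks) {
        return Math.floorMod(rawValue, numReduceTasks);
    }

    public static void main(String[] args) {
        int numReduceTasks = 3;
        System.out.println(isValidPartition(5, numReduceTasks)); // false
        System.out.println(safePartition(5, numReduceTasks));    // 2
    }
}
```

In a real custom Partitioner, `getPartition` receives the number of partitions as its last argument, so the same kind of expression can be applied to that argument before returning.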

Have a look at this error-free solution, which lets TotalOrderPartitioner derive the partitions from a sample of the input:

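// Pick each key with probability 0.1, up to 100 samples in total.
InputSampler.Sampler<IntWritable, Text> sampler =
    new InputSampler.RandomSampler<IntWritable, Text>(0.1, 100);
// Write the partition file that TotalOrderPartitioner reads its split points from.
InputSampler.writePartitionFile(conf, sampler);
// Partition by key range instead of a hand-written partitioner.
conf.setPartitionerClass(TotalOrderPartitioner.class);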

Have a look at this article for more details on partitioning

Ravindra babu