
I'm using two mappers and two reducers. I'm getting the following error:

java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

This is because the first reducer writes <Text, IntWritable>, so my second mapper should be receiving <Text, IntWritable>, but, as I read, mappers take <LongWritable, Text> by default.

So, I have to set the input format with something like:

job2.setInputFormatClass(MyInputFormat.class);

Is there a way to set the InputFormat class to receive <Text,IntWritable>?

Hernan

2 Answers


You don't need your own input format. All you need is to set SequenceFileOutputFormat for the first job and SequenceFileInputFormat for the second job.

TextInputFormat uses LongWritable keys and Text values, but SequenceFileInputFormat uses whatever types you used to store the output.
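As a minimal driver sketch (using the standard `org.apache.hadoop.mapreduce` API; the mapper/reducer class names and the paths are placeholders, not taken from the question):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path intermediate = new Path("intermediate");

        // Job 1: emit <Text, IntWritable> and store it as a sequence file
        Job job1 = Job.getInstance(conf, "job1");
        job1.setJarByClass(ChainDriver.class);
        job1.setMapperClass(FirstMapper.class);      // placeholder class
        job1.setReducerClass(FirstReducer.class);    // placeholder class
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        job1.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        FileOutputFormat.setOutputPath(job1, intermediate);
        job1.waitForCompletion(true);

        // Job 2: read the sequence file back, so its mapper receives
        // <Text, IntWritable> rather than the default <LongWritable, Text>
        Job job2 = Job.getInstance(conf, "job2");
        job2.setJarByClass(ChainDriver.class);
        job2.setMapperClass(SecondMapper.class);     // placeholder class
        job2.setReducerClass(SecondReducer.class);   // placeholder class
        job2.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, new Path(args[1]));
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```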

vefthym

The input types to your mapper are set by the InputFormat as you suspect.

Generally, when you're chaining jobs together like this, it's best to use SequenceFileOutputFormat and, in the next job, SequenceFileInputFormat. That way the types are handled for you, and you set the types to be the same, i.e. the mapper's inputs are the same as the previous reducer's outputs.
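Concretely, the second job's mapper would then declare input types matching the first reducer's output types (a sketch; `SecondMapper` and the pass-through body are placeholders, assuming the first reducer emits <Text, IntWritable>):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input key/value types are <Text, IntWritable> because
// SequenceFileInputFormat hands back exactly the pairs that
// the first job's reducer stored in the sequence file.
public class SecondMapper extends Mapper<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void map(Text key, IntWritable value, Context context)
            throws IOException, InterruptedException {
        // Pass-through for illustration; real logic goes here.
        context.write(key, value);
    }
}
```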

Binary Nerd
    we practically posted the same answer, at the same minute, 1.5 hour after the question :) – vefthym Oct 24 '16 at 07:06
  • Thanks for answering. But even though I had changed the input and output format classes to SequenceFileInputFormat and SequenceFileOutputFormat, it still throws an error: expected IntWritable, received Text. I've asked a [new question](http://stackoverflow.com/questions/40221115/how-to-set-a-a-reducer-to-emmit-text-intwritable-and-a-mapper-to-receive-tex) because I want to include some code; I think this question has been perfectly answered. – Hernan Oct 24 '16 at 16:29