
I am running a MapReduce job, and the error I am getting is:

    Error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
        at test.temp$Mymapper.map(temp.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The code is given below:

    package test;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    //import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class temp {

        public static class Mymapper extends Mapper<Object, Text, IntWritable, Text> {

            public void map(Object key, Text value, Context context) throws IOException, InterruptedException {

                int month = Integer.parseInt(value.toString().substring(17, 19));
                IntWritable mon = new IntWritable(month);
                String temp = value.toString().substring(27, 31);
                String t = null;
                for (int i = 0; i < temp.length(); i++) {
                    if (temp.charAt(i) == ',')
                        break;
                    else
                        t = t + temp.charAt(i);
                }

                Text data = new Text(value.toString().substring(22, 26) + t);
                context.write(mon, data);
            }
        }

        public static class Myreducer extends Reducer<IntWritable, Text, IntWritable, IntWritable> {

            public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
                String temp = "";
                int max = 0;
                for (Text t : values) {
                    temp = t.toString();
                    if (temp.substring(0, 4) == "TMAX") {

                        if (Integer.parseInt(temp.substring(4, temp.length())) > max) {
                            max = Integer.parseInt(temp.substring(4, temp.length()));
                        }
                    }
                }

                context.write(key, new IntWritable(max));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "temp");
            job.setJarByClass(temp.class);
            job.setMapperClass(Mymapper.class);
            job.setCombinerClass(Myreducer.class);
            job.setReducerClass(Myreducer.class);
            job.setOutputKeyClass(IntWritable.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            job.waitForCompletion(true);
        }
    }

and the input file is

    USC00300379,19000101,TMAX,-78,,,6, USC00300379,19000101,TMAX,-133,,,6, USC00300379,19000101,TMAX,127,,,6

kindly reply and help please!

– harsh mehta
4 Answers


I think you're using TextInputFormat as the input format for the job. That produces LongWritable/Text pairs, and Hadoop is deriving the map-output classes from that.

Try setting the map output classes explicitly and removing the combiner:

    job.setMapOutputKeyClass(IntWritable.class);
    job.setMapOutputValueClass(Text.class);
    // job.setCombinerClass(Myreducer.class);

The combiner will only work if the map and reduce outputs are compatible!
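
To make the first point concrete, here is an illustrative mapper declared with the types that TextInputFormat actually supplies. This is only a sketch, not part of the fix: the class name MonthMapper, splitting on commas (instead of the question's fixed substring offsets), and one-record-per-line input are assumptions made for the sketch.

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // With the default TextInputFormat, each record arrives as
    // (LongWritable byteOffset, Text line), so the input key is a LongWritable.
    // The output types are what the question's mapper emits: (IntWritable, Text).
    public class MonthMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // e.g. "USC00300379,19000101,TMAX,-78,,,6,"
            String[] fields = value.toString().split(",");
            int month = Integer.parseInt(fields[1].substring(4, 6)); // "19000101" -> 1
            context.write(new IntWritable(month), new Text(fields[2] + fields[3])); // "TMAX-78"
        }
    }

Declaring LongWritable instead of Object does not change the behaviour of the job; it only spells out the input key type that TextInputFormat supplies. The two setMapOutput* calls above are still needed either way.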

– oae
  • OK, I updated my answer. You also need to set the map output value class and unset the combiner. With those changes your code completes successfully! – oae Jan 05 '16 at 12:05
  • I made those changes, and it's still giving the same error @oae. Are you sure it ran successfully? – harsh mehta Jan 05 '16 at 12:40
  • Yes, I copied your code, added the 2 lines setting the key and value classes, and removed the combiner class! Can you check carefully whether it's exactly the same message? Until I had the complete fix I received varying error messages that looked the same, but on a second look they were slightly different! – oae Jan 05 '16 at 14:49

You have set the following in your driver:

    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);

This means that, for both your mapper and reducer, the output key class should be IntWritable and the output value class should be IntWritable.

The reducer is good:

    public static class Myreducer extends Reducer<IntWritable, Text, IntWritable, IntWritable>

Here both the output key and value are IntWritable.

The problem is with the mapper:

    public static class Mymapper extends Mapper<Object, Text, IntWritable, Text>

Here the output key class is IntWritable, but the output value class is Text (it is expected to be IntWritable).

If your mapper's output key/value classes are different from your reducer's output key/value classes, then you need to explicitly add the following statements to your driver:

    setMapOutputKeyClass();
    setMapOutputValueClass();

Make the following changes in your code:

  • Set the map output key and value classes: In your case, since your mapper and reducer output key and value classes are different, you need to set the following:

    job.setMapOutputKeyClass(IntWritable.class);
    job.setMapOutputValueClass(Text.class);
    
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);
    
  • Disable the combiner: Since you are using your Reducer code as your Combiner, the Combiner's output will be IntWritable and IntWritable. But the Reducer expects its input to be IntWritable and Text. Hence, you will get the following exception, because it received the value as an IntWritable instead of a Text:

    Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
    

    To remove this error, you need to disable the Combiner by removing this line from your driver:

    job.setCombinerClass(Myreducer.class);
    
  • Don't use the reducer as a combiner: If you definitely need a combiner, then write one whose output key/value classes are IntWritable and Text; a rough sketch follows this list.
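
For illustration only, here is one way such a combiner could look. This is a sketch, not something taken from the question: the class name Mycombiner is made up, and it assumes the mapper's values follow the "TMAX<value>" Text convention discussed above.

    // Goes inside the temp class, next to Mymapper and Myreducer.
    // Its input AND output types both match the mapper's output (IntWritable, Text),
    // so the reducer still receives Text values whether or not the combiner runs.
    public static class Mycombiner extends Reducer<IntWritable, Text, IntWritable, Text> {

        @Override
        protected void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            int max = Integer.MIN_VALUE;
            for (Text t : values) {
                String s = t.toString();
                // keep only the "TMAX..." values, which is what the reducer intends to filter on
                if (s.startsWith("TMAX")) {
                    max = Math.max(max, Integer.parseInt(s.substring(4)));
                }
            }
            if (max != Integer.MIN_VALUE) {
                // re-emit in the same "TMAX<value>" Text format the reducer expects
                context.write(key, new Text("TMAX" + max));
            }
        }
    }

It would then be registered with job.setCombinerClass(Mycombiner.class); instead of Myreducer.class.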

– Manjunath Ballur

When you set the following in your driver,

    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);

it defines the output classes for both the mapper and the reducer, not just the reducer.

That means your mapper should call context.write(IntWritable, IntWritable), but you have coded context.write(IntWritable, Text).

Fix: When your map output types are not the same as your reduce output types, you need to explicitly set the output types for the mapper. So, add the lines below to your driver code.

    job.setMapOutputKeyClass(IntWritable.class);
    job.setMapOutputValueClass(Text.class);
– PonMaran

This is the change I have made.

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "temp");

        job.setJarByClass(Temp.class);

        job.setMapperClass(Mymapper.class);
        job.setReducerClass(Myreducer.class);

        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setNumReduceTasks(1);
        job.waitForCompletion(true);
    }

Output: 10 0

For an explanation, see Manjunath Ballur's post.

– srikanth