I am trying to split a string using MapReduce 2 (YARN) in the Hortonworks Sandbox. It throws an ArrayIndexOutOfBoundsException if I try to access val[1]. It works fine when I don't split the input file.

Mapper:

public class MapperClass extends Mapper<Object, Text, Text, Text> {

    private Text airline_id;
    private Text name;
    private Text country;
    private Text value1;

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {

        String s = value.toString();
        if (s.length() > 1) {

            String val[] = s.split(",");
            context.write(new Text("blah"), new Text(val[1]));
        }


    }
}

Reducer:

public class ReducerClass extends Reducer<Text, Text, Text, Text> {

private Text result = new Text();

public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {

    String airports = "";

    if (key.equals("India")) {
        for (Text val : values) {
            airports += "\t" + val.toString();
        }
        result.set(airports);
        context.write(key, result);
    }
}
}

MainClass:

public class MainClass {

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

    Configuration conf = new Configuration();
    @SuppressWarnings("deprecation")
    Job job = new Job(conf, "Flights MR");

    job.setJarByClass(MainClass.class);
    job.setMapperClass(MapperClass.class);
    job.setReducerClass(ReducerClass.class);

    job.setNumReduceTasks(0);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    job.setInputFormatClass(KeyValueTextInputFormat.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);

}

}

Can you help?

Update:

Figured out that it doesn't convert Text to String.

Sundari

1 Answer

If the string you are splitting does not contain a comma, the resulting String[] will have length 1, with the entire string at val[0].
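The behaviour of String.split described above can be checked outside Hadoop. The following is a minimal sketch; the class name SplitDemo, the helper fieldCount, and the sample lines are made up for illustration and are not taken from the question's input file.

```java
public class SplitDemo {
    // Hypothetical helper: number of elements String.split(",") returns for a line.
    public static int fieldCount(String line) {
        return line.split(",").length;
    }

    public static void main(String[] args) {
        // Three comma-separated fields, so val[1] would be safe here.
        System.out.println(fieldCount("1,Air India,India")); // 3
        // No comma: the whole string ends up in element 0, so val[1] would throw.
        System.out.println(fieldCount("no comma here"));     // 1
        // Trailing empty strings are dropped, so a comma alone does not guarantee val[1].
        System.out.println(fieldCount("abc,"));              // 1
        System.out.println(fieldCount(","));                 // 0
    }
}
```

Note the last two cases: a line can contain a comma and still produce an array with fewer than two elements.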

Currently, you are only making sure that the string is longer than one character

if (s.length() > 1)

But you are not checking that the split actually produced an array of length greater than 1; the code simply assumes that a split happened.

context.write(new Text("blah"), new Text(val[1]));

If there was no split, this will throw an ArrayIndexOutOfBoundsException. A possible solution is to check the length of the array returned by split before reading val[1], instead of only checking the length of the string, like so:

String s = value.toString();
String[] val = s.split(",");
// Checking for a comma alone is not enough: split drops trailing
// empty strings, so "abc," still yields an array of length 1.
if (val.length > 1) {
    context.write(new Text("blah"), new Text(val[1]));
}
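
Checking the length of the split array can also be pulled into a small helper that is testable without a cluster. This is only a sketch under assumptions: FieldExtractor and secondField are hypothetical names, the sample lines are invented, and passing -1 as the limit (to keep trailing empty fields) is a design choice that may or may not suit the real data.

```java
import java.util.Optional;

public class FieldExtractor {
    // Hypothetical helper: returns the second comma-separated field, if the line has one.
    public static Optional<String> secondField(String line) {
        // A negative limit keeps trailing empty fields, so "abc," yields ["abc", ""].
        String[] fields = line.split(",", -1);
        return fields.length > 1 ? Optional.of(fields[1]) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(secondField("1,Air India,India")); // Optional[Air India]
        System.out.println(secondField("no comma here"));     // Optional.empty
        System.out.println(secondField("abc,"));              // Optional[]
    }
}
```

In the mapper, the write would then happen only when the returned Optional is present, for example by calling isPresent() before context.write.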
Hangman4358