2

I am attempting to create a dynamic map reduce application that takes in dimensions from an external properties file. The main problem lies in the fact that the variables i.e. the key will be composite and may be of whatever numbers, e.g pair of 3 keys, pair of 4 keys, etc.

My Mapper:

public void map(AvroKey<flumeLogs> key, NullWritable value, Context context) throws IOException, InterruptedException{
    Configuration conf = context.getConfiguration();
    int dimensionCount = Integer.parseInt(conf.get("dimensionCount"));
    String[] dimensions = conf.get("dimensions").split(","); //this gets the dimensions from the run method in main

    Text[] values = new Text[dimensionCount]; //This is supposed to be my composite key

    for (int i=0; i<dimensionCount; i++){
        switch(dimensions[i]){

        case "region":  values[i] = new Text("-");
            break;

        case "event":  values[i] = new Text("-");
            break;

        case "eventCode":  values[i] = new Text("-");
            break;

        case "mobile":  values[i] = new Text("-");
        }
    }
    context.write(new StringArrayWritable(values), new IntWritable(1));

}

The values will have good logic later.

My StringArrayWritable:

public class StringArrayWritable extends ArrayWritable {
public StringArrayWritable() {
    super(Text.class);
}

public StringArrayWritable(Text[] values){
    super(Text.class, values);
    Text[] texts = new Text[values.length];
    for (int i = 0; i < values.length; i++) {
        texts[i] = new Text(values[i]);
    }
    set(texts);
}

@Override
public String toString(){
    StringBuilder sb = new StringBuilder();

    for(String s : super.toStrings()){
        sb.append(s).append("\t");
    }

    return sb.toString();
}
}

The error I am getting:

    Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :class StringArrayWritable
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: class StringArrayWritable
    at java.lang.Class.asSubclass(Class.java:3165)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:892)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1005)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
    ... 9 more

Any help would be greatly appreciated.

Thanks a lot.

Pritish Kamath
  • 159
  • 1
  • 8

1 Answers1

1

You're trying to use a Writable object as the key. In mapreduce the key must implement the WritableComparable interface. ArrayWritable only implements the Writable interface.

The difference between the two is that the comaprable interface requires you to implement a compareTo method so that mapreduce is able to sort and group the keys correctly.

Binary Nerd
  • 13,872
  • 4
  • 42
  • 44
  • can you please explain what is the function of a compareTo() function does actually. Btw thanks a lot for helping a noob... :) – Pritish Kamath Aug 31 '16 at 12:40
  • Well mapreduce needs to know how you want the keys ordering and grouped. The Javadocs explain compare in detail: https://docs.oracle.com/javase/7/docs/api/java/lang/Comparable.html. There are lots of resources on the net to help you, including the hadoop source code: https://github.com/apache/hadoop/tree/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io. So you need to think about how you want your key to be sorted. – Binary Nerd Aug 31 '16 at 12:58