
I have implemented a MapReduce operation for log files using Amazon EMR and Hadoop with a custom JAR.

My output shows the correct keys and values, but all the records are printed on a single line. For example, given the following pairs:

<1387, 2>
<1388, 1>

This is what's printing:

1387     21388     1

This is what I'm expecting:

1387     2
1388     1

How can I fix this?

  • Can you please show the code where you print out the records? As it stands, we have no clue what you're doing! (My guess is that you're missing adding a newline character) – Chris Forrence Sep 11 '14 at 16:30
  • public static void main(String[] args) throws Exception { JobConf conf = new JobConf(LogAnalyzer.class); conf.setJobName("Loganalyzer"); conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(IntWritable.class); conf.setMapperClass(LogAnalyzer.Map.class); conf.setCombinerClass(LogAnalyzer.Reduce.class); conf.setReducerClass(LogAnalyzer.Reduce.class); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(TextOutputFormat.class); conf.set("mapreduce.textoutputformat.separator", "--"); – Deepa Sadagopan Sep 13 '14 at 13:28
  • FileInputFormat.setInputPaths(conf, new Path(args[0])); FileOutputFormat.setOutputPath(conf, new Path(args[1])); JobClient.runJob(conf); } – Deepa Sadagopan Sep 13 '14 at 13:31
  • Reduce function : public static class Reduce extends MapReduceBase implements Reducer { public void reduce(Text key, Iterator values, OutputCollector output, Reporter reporter) throws IOException { int sum = 0; while (values.hasNext()) { sum += values.next().get(); } output.collect(key, new IntWritable(sum)); } } – Deepa Sadagopan Sep 13 '14 at 13:32
  • Map function: public void map(LongWritable key, Text value, OutputCollector output, Reporter reporter) throws IOException { String line = ((Text) value).toString(); Matcher matcher = p.matcher(line); if (matcher.matches()) { String timestamp = matcher.group(4); minute.set(getMinuteBucket(timestamp)); output.collect(minute, ONE); //context.write(minute, one); } } – Deepa Sadagopan Sep 13 '14 at 13:32

1 Answer


Cleaned up your code for you :)

public static void main(String[] args) throws Exception {   
    JobConf conf = new JobConf(LogAnalyzer.class);
    conf.setJobName("Loganalyzer");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(LogAnalyzer.Map.class);
    conf.setCombinerClass(LogAnalyzer.Reduce.class);
    conf.setReducerClass(LogAnalyzer.Reduce.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    conf.set("mapreduce.textoutputformat.separator", "--");
    FileInputFormat.setInputPaths(conf, new Path(args[0])); 
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf); 
}

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> { 
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { 
        int sum = 0; 
        while (values.hasNext()) {
            sum += values.next().get(); 
        }
        output.collect(key, new IntWritable(sum)); 
    } 
}


public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
    String line = value.toString(); // value is already a Text; the cast was redundant
    Matcher matcher = p.matcher(line); 
    if (matcher.matches()) { 
        String timestamp = matcher.group(4); 
        minute.set(getMinuteBucket(timestamp)); 
        output.collect(minute, ONE); //context.write(minute, one); 
    }
}

This isn't hadoop-streaming; it's a normal Java job, so you should amend the tag on the question.

This looks okay to me, although you don't have the mapper inside a class, which I assume is a copy/paste omission.

As for the line endings: are you by any chance viewing the output on Windows? It could be a Unix/Windows line-ending mismatch. If you open the file in Sublime Text or another capable editor, you can switch between Unix and Windows line endings; see if that fixes it.
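You can also check from a terminal whether the newlines are really in the file. A minimal sketch; it writes two sample records to a throwaway file (`sample.txt` is just an illustration, in practice you'd point `cat` at your job's `part-00000` output) and makes the invisible separators printable:

```shell
# Write two records the way TextOutputFormat does by default:
# tab between key and value, one '\n' after each record.
printf '1387\t2\n1388\t1\n' > sample.txt

# -e prints '$' at the end of each line, -t prints tabs as '^I'
# (both flags work with GNU and BSD cat).
cat -et sample.txt
# 1387^I2$
# 1388^I1$
```

If you see `$` at the end of each record, the Unix newlines are there and the file is fine; a `^M$` ending would indicate Windows-style CRLF. If the records really do run together with no `$` between them, the problem is in how the output is being written, not in the viewer.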

Nonnib