You can get all lines of a file in the reducer task. If that solves your issue, take a look at this:
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FileLineComparison {

    public static class Map extends Mapper<LongWritable, Text, Text, Text> {

        private Text fileName = new Text();

        @Override
        public void map(LongWritable key, Text line, Context context)
                throws IOException, InterruptedException {
            /*
             * Get the file name from the context and use it as the key,
             * so that the reducer receives all lines of that file
             * from one or more mappers.
             */
            FileSplit fileSplit = (FileSplit) context.getInputSplit();
            fileName.set(fileSplit.getPath().getName());
            context.write(fileName, line);
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text filename, Iterable<Text> allLinesOfsinglefile, Context context)
                throws IOException, InterruptedException {
            for (Text val : allLinesOfsinglefile) {
                /*
                 * You get each line of the file here.
                 * To compare each line with the rest, first copy the lines
                 * into a list (e.g. with val.toString()): the Iterable can
                 * be traversed only once, and Hadoop reuses the Text object.
                 * Do your processing here.
                 */
            }
            /*
             * Write to the output file, if required.
             */
            context.write(filename, filename);
        }
    }
}
Or, if you really need it in the mapper, read the file yourself in each mapper, since the file name and path are available from the split.
This is only recommended when the file is small.
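For the reducer-side comparison mentioned in the code comments, here is a minimal sketch in plain Java with no Hadoop dependency. It shows the idea only: copy the values into a list first, then loop over the list twice. The class name `LineComparisonSketch` and the helper `countEqualPairs` are made up for illustration, and "compare" is assumed to mean checking for identical lines; replace that check with whatever comparison you need.

```java
import java.util.ArrayList;
import java.util.List;

public class LineComparisonSketch {

    // Hypothetical helper: buffers the single-pass values into a list, then
    // compares every line with every later line and counts identical pairs.
    public static int countEqualPairs(Iterable<?> values) {
        List<String> lines = new ArrayList<>();
        for (Object value : values) {
            // toString() makes an independent copy; in a reducer this matters
            // because Hadoop reuses the same Text instance for each value.
            lines.add(value.toString());
        }
        int equalPairs = 0;
        for (int i = 0; i < lines.size(); i++) {
            for (int j = i + 1; j < lines.size(); j++) {
                if (lines.get(i).equals(lines.get(j))) {
                    equalPairs++;
                }
            }
        }
        return equalPairs;
    }

    public static void main(String[] args) {
        // Stand-in for the lines of one file as they would arrive in reduce().
        List<String> sample = List.of("a", "b", "a", "c", "a");
        System.out.println(countEqualPairs(sample));
    }
}
```

Inside `reduce` you would pass `allLinesOfsinglefile` to a helper like this. Keep in mind that buffering every line means one file has to fit in the reducer's memory, and the nested loop is quadratic in the number of lines.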