I have to read the entire contents of fileA and pass them to a map function. In the map function, the key is fileB and the value is the contents of fileA. In the OutputFormat RecordWriter, I append all the values (the entire contents of fileA) to fileB using the sequence file writer's append method. The problems are:
1. I am loading the entire file contents in the InputFormat RecordReader and passing them to a single map call.
2. I am appending the entire contents to the sequence file in one call.
PseudoCode:
InputFormat RecordReader:
@Override
public boolean nextKeyValue() throws IOException, InterruptedException {
    if (flag > 0)
        return false;
    flag++;
    String re = read all contents of file
    String key = k1;
    allRecords = new TextArrayWritable(Text.class,
            new Text[] {new Text(key), new Text(re)});
    return true;
}

@Override
public TextArrayWritable getCurrentValue() throws IOException, InterruptedException {
    return allRecords;
}
Map Function:
protected void map(Text key, TextArrayWritable value, Context context)
        throws IOException, InterruptedException {
    context.write(new Text(fileA path), value);
}
OutputFormat RecordWriter:
@Override
public void write(Text fileDir, TextArrayWritable contents)
        throws IOException, InterruptedException {
    SequenceFileWriter.append(contents.get()[0], contents.get()[1]);
}
Both operations happen entirely in memory and may throw an OutOfMemoryError if the file is too large. Is there any way to avoid loading the entire contents into memory and still append them to the sequence file?
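
To make the question concrete, here is a minimal sketch of the kind of chunked approach I have in mind, written in plain Java without Hadoop so it runs on its own. It reads the input in fixed-size pieces and appends each piece as its own record, so only one chunk is held in memory at a time. The names ChunkedAppendSketch, RecordSink and appendInChunks are hypothetical, and RecordSink is only a stand-in for the sequence file writer's append call. In the real job this would presumably mean that nextKeyValue() returns one chunk per call instead of the whole file, and that the value is a byte-oriented type so a chunk boundary cannot split a multi-byte character.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChunkedAppendSketch {

    /** Hypothetical stand-in for the sequence file writer's append(key, value). */
    interface RecordSink {
        void append(String key, byte[] value) throws IOException;
    }

    /**
     * Reads the stream in pieces of at most chunkSize bytes and appends each
     * piece as its own record under the same key. Only one chunk-sized buffer
     * is held at a time. Returns the number of records appended.
     */
    static int appendInChunks(InputStream in, String key, int chunkSize, RecordSink sink)
            throws IOException {
        byte[] buffer = new byte[chunkSize];
        int records = 0;
        int filled;
        while ((filled = fill(in, buffer)) > 0) {
            // Copy so the sink never sees the reused buffer.
            sink.append(key, Arrays.copyOf(buffer, filled));
            records++;
        }
        return records;
    }

    /** Fills the buffer as far as the stream allows; returns bytes read (0 at end of stream). */
    private static int fill(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.UTF_8);
        List<byte[]> written = new ArrayList<>();
        // The lambda plays the role of the writer; here it just collects the chunks.
        int count = appendInChunks(new ByteArrayInputStream(data), "fileB", 4,
                (k, v) -> written.add(v));
        System.out.println(count + " records appended");
    }
}
```

With this layout the sequence file would hold several records per input file instead of one, so whatever reads fileB later would have to concatenate the values that share a key. Whether that is acceptable is part of what I am asking.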