Sure, basically the same way I described in the other thread you linked, but you have to implement your own Mapper.
Here is a quick sketch:
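import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class LongLongMapper extends
    Mapper<LongWritable, Text, LongWritable, LongWritable> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // assuming that your line contains key and value separated by \t
    String[] split = value.toString().split("\t");
    context.write(new LongWritable(Long.parseLong(split[0])),
        new LongWritable(Long.parseLong(split[1])));
  }

  public static void main(String[] args) throws IOException,
      InterruptedException, ClassNotFoundException {
    Configuration conf = new Configuration();
    // on Hadoop 2.x and later, Job.getInstance(conf) replaces this
    // deprecated constructor
    Job job = new Job(conf);
    job.setJobName("Convert Text");
    job.setJarByClass(LongLongMapper.class);
    // use our own mapper, not the identity Mapper
    job.setMapperClass(LongLongMapper.class);
    // identity reducer; only used if the number of reduce tasks is > 0
    job.setReducerClass(Reducer.class);
    // increase if you need sorting or a special number of files
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(LongWritable.class);
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path("/input"));
    FileOutputFormat.setOutputPath(job, new Path("/output"));
    // submit and wait for completion
    job.waitForCompletion(true);
  }
}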
Each call to map receives one line of your input as its value (with TextInputFormat, the key is the byte offset of that line in the file), so we just split the line on your delimiter (tab) and parse both parts into longs.
That's it.
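If your input might contain malformed lines, the split-and-parse step in map could be guarded. Below is a minimal plain-Java sketch of that step on its own, without any Hadoop dependency, so you can try it locally. The class name `LineParser` and the method `parseLine` are hypothetical names made up for illustration, and skipping bad lines (rather than failing the task) is an assumption about what you want.

```java
import java.util.Optional;

public class LineParser {

    /**
     * Parses one "key<TAB>value" line into a two-element array {key, value}.
     * Returns Optional.empty() when the line does not consist of exactly
     * two tab-separated fields that both parse as longs.
     */
    static Optional<long[]> parseLine(String line) {
        String[] split = line.split("\t");
        if (split.length != 2) {
            return Optional.empty();
        }
        try {
            long key = Long.parseLong(split[0].trim());
            long value = Long.parseLong(split[1].trim());
            return Optional.of(new long[] {key, value});
        } catch (NumberFormatException e) {
            // one of the fields is not a valid long
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] lines = {"1\t100", "2\t200", "broken line", "3\tabc"};
        for (String line : lines) {
            Optional<long[]> parsed = parseLine(line);
            if (parsed.isPresent()) {
                System.out.println(parsed.get()[0] + " -> " + parsed.get()[1]);
            } else {
                System.out.println("skipped: " + line);
            }
        }
    }
}
```

Inside the mapper you would call context.write only when a line parses successfully, and perhaps count the skipped lines so you notice bad input.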