I have a MapReduce job that uses MultipleOutputs and writes 500 files. I want to compress (zip) those files without merging them.
1 Answer
Use SequenceFileOutputFormat, an OutputFormat that writes keys and values to SequenceFiles in binary (raw) format.
SequenceFile.CompressionType has three values:
BLOCK: Compress sequences of records together in blocks.
NONE: Do not compress records.
RECORD: Compress values only, each separately.
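As a rough illustration of why BLOCK usually produces smaller output than RECORD, here is a minimal sketch that uses java.util.zip.Deflater from the JDK rather than Hadoop itself. The class CompressionTypeSketch, its method names, and the sample values are hypothetical. It does not reproduce the SequenceFile on-disk format; it only contrasts compressing each value on its own with compressing a batch of values together.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

public class CompressionTypeSketch {

    // Deflate one byte array and return the compressed length in bytes.
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // RECORD-like: every value is compressed separately.
    public static int perRecordSize(List<String> values) {
        int total = 0;
        for (String value : values) {
            total += deflatedSize(value.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    // BLOCK-like: values are concatenated and compressed together.
    public static int blockSize(List<String> values) {
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        for (String value : values) {
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
            block.write(bytes, 0, bytes.length);
        }
        return deflatedSize(block.toByteArray());
    }

    // Hypothetical sample data: short, similar-looking records.
    public static List<String> sampleValues(int count) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            values.add("user=alice status=ok count=" + i);
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> values = sampleValues(500);
        System.out.println("per-record bytes: " + perRecordSize(values));
        System.out.println("block bytes:      " + blockSize(values));
    }
}
```

With short, repetitive values like these, the block total should come out well below the per-record total, because a shared compression context can exploit repetition across records. That is the usual reason to prefer BLOCK when readers do not need to decompress single records in isolation.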
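Key changes in your code:
// Needs org.apache.hadoop.fs.Path, org.apache.hadoop.io.SequenceFile.CompressionType
// and org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.
Path outDir = new Path(WORK_DIR_PREFIX + "/out/" + jobName);
// Write SequenceFiles instead of the default text output.
job.setOutputFormatClass(SequenceFileOutputFormat.class);
SequenceFileOutputFormat.setOutputPath(job, outDir);
// Compress batches of records together.
SequenceFileOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);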
Have a look at a working example of SequenceFileOutputFormat usage.

Ravindra babu