
I'm running a Pig script from Java using the org.apache.pig.PigServer class. I need to write my output as SequenceFiles compressed with gzip. This is what I've done:

effectivePigProperties.put("mapred.output.compress", "true");
effectivePigProperties.put("mapred.output.format.class", "org.apache.hadoop.mapred.SequenceFileOutputFormat");
effectivePigProperties.put("mapred.output.compression.type", "SequenceFile.CompressionType.BLOCK");
effectivePigProperties.put("mapred.output.compression.codec", "org.apache.hadoop.io.compress.GzipCodec");

The output is gzip-compressed, but the files are not SequenceFiles. What am I missing?

ryuchtman
  • duplicate question: http://stackoverflow.com/questions/2423949/storing-data-to-sequencefile-from-apache-pig – octo Oct 12 '12 at 20:01

1 Answer


Apache Pig does not yet include a SequenceFile store function, either in the core bundle or in Piggybank. Twitter's Elephant Bird library provides a SequenceFileStorage implementation that you can use instead.
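A minimal sketch of how this could be wired up from Java, under some assumptions: the fully qualified class names (com.twitter.elephantbird.pig.store.SequenceFileStorage and com.twitter.elephantbird.pig.util.TextConverter) and the '-c' converter arguments are from my recollection of Elephant Bird and should be checked against the version you use. The alias "pairs" and the path "/tmp/out" are hypothetical, and the relation being stored is assumed to have exactly two fields (key, value). Note also that mapred.output.compression.type expects the enum constant name, so "BLOCK" rather than "SequenceFile.CompressionType.BLOCK". Whether the store function honors these compression properties is something I have not verified.

```java
import java.util.Properties;

class SequenceFileStoreSketch {

    // Compression settings from the question, with the compression type
    // given as the bare enum constant name ("BLOCK").
    static Properties compressionProperties() {
        Properties props = new Properties();
        props.setProperty("mapred.output.compress", "true");
        props.setProperty("mapred.output.compression.type", "BLOCK");
        props.setProperty("mapred.output.compression.codec",
                "org.apache.hadoop.io.compress.GzipCodec");
        return props;
    }

    // Builds a Pig Latin STORE statement that names the store function
    // explicitly. Class names and '-c' arguments are assumptions about
    // Elephant Bird; verify them against your version.
    static String storeStatement(String alias, String outputPath) {
        String storage = "com.twitter.elephantbird.pig.store.SequenceFileStorage";
        String textConverter = "com.twitter.elephantbird.pig.util.TextConverter";
        return "STORE " + alias + " INTO '" + outputPath + "' USING " + storage
                + "('-c " + textConverter + "', '-c " + textConverter + "');";
    }

    public static void main(String[] args) {
        Properties props = compressionProperties();
        String store = storeStatement("pairs", "/tmp/out"); // hypothetical alias and path
        System.out.println(store);
        // With Pig and Elephant Bird on the classpath, props would be passed
        // to the PigServer constructor and the statement to registerQuery,
        // after registering the Elephant Bird jars.
    }
}
```

The point of the sketch is that the output format is chosen in the STORE ... USING clause of the script, not through mapred.output.format.class.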

Harsh J