I ran a Hadoop job that generated multiple .deflate files, which are now stored on S3. I cannot run the hadoop fs -text /somepath command on them, since it takes an HDFS path. How can I convert multiple .deflate files stored on S3 into a single gzip file?

Naresh

1 Answer

If you produce gzip files instead, using the GzipCodec, you can simply concatenate them to make one large gzip file, since a concatenation of gzip members is itself a valid gzip stream.
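As a minimal sketch of that concatenation, assuming the gzip parts have already been downloaded from S3 to local disk (the function name `concat_files` and the paths are hypothetical, chosen only for illustration):

```python
import shutil

def concat_files(part_paths, out_path):
    """Append each part file to out_path, byte for byte, in the given order."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                # Plain byte copy: no decompression or recompression happens.
                shutil.copyfileobj(part, out)
```

The result decompresses to the parts' contents in order with any tool that handles multi-member gzip files, such as `gzip -d` or Python's `gzip` module.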

You can wrap a deflate stream with a gzip header and trailer, as described in RFC 1952: a fixed 10-byte header, and an 8-byte trailer that is computed from the uncompressed data. So you will need to decompress each .deflate stream in order to compute its CRC-32 and uncompressed length to put in the trailer.
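A rough sketch of that wrapping in Python, under a few assumptions: each file fits in memory, the function names are hypothetical, and `wrap_raw_deflate` expects raw deflate data (RFC 1951). If your .deflate files turn out to be zlib-wrapped (RFC 1950), which is worth checking for Hadoop output, the 2-byte zlib header and 4-byte Adler-32 trailer have to be stripped first; `strip_zlib_wrapper` does that for streams without a preset dictionary.

```python
import struct
import zlib

# Fixed 10-byte gzip header (RFC 1952): magic 1f 8b, CM=8 (deflate),
# FLG=0, MTIME=0, XFL=0, OS=255 (unknown).
GZIP_HEADER = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff"

def strip_zlib_wrapper(blob: bytes) -> bytes:
    """Return the raw deflate data inside a zlib stream (assumption: no preset dictionary)."""
    if blob[1] & 0x20:  # FDICT bit set: a 4-byte dictionary id would follow the header
        raise ValueError("zlib stream uses a preset dictionary")
    return blob[2:-4]   # drop 2-byte header and 4-byte Adler-32 trailer

def wrap_raw_deflate(raw: bytes) -> bytes:
    """Wrap one raw deflate stream as a single gzip member."""
    # Decompress only to compute the trailer fields; wbits=-15 selects raw deflate.
    d = zlib.decompressobj(wbits=-15)
    data = d.decompress(raw) + d.flush()
    crc = zlib.crc32(data) & 0xFFFFFFFF
    size = len(data) & 0xFFFFFFFF  # ISIZE is the uncompressed length modulo 2**32
    # Trailer: CRC-32 then ISIZE, both little-endian 32-bit.
    return GZIP_HEADER + raw + struct.pack("<II", crc, size)
```

Each wrapped result is a complete gzip member, so the members can be written one after another to form a single gzip file without recompressing anything.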

Mark Adler
  • But I already have the files in .deflate format, so my question is how to convert them to gzip. – Naresh Dec 04 '14 at 08:28