I ran a Hadoop job which generated multiple .deflate files, and those files are now stored on S3. Because they are not on HDFS, I cannot simply run the hadoop fs -text /somepath command, which takes an HDFS path. I want to convert the multiple .deflate files stored on S3 into one gzip file.
Naresh
1 Answer
If you make gzip files instead, using the GzipCodec, you can simply concatenate them to make one large gzip file.
You can wrap each deflate stream with a gzip header and trailer, as described in RFC 1952: a fixed 10-byte header, followed by the deflate data, followed by an 8-byte trailer computed from the uncompressed data. So you will need to decompress each .deflate stream in order to compute its CRC-32 and uncompressed length to put in the trailer.
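
For illustration, here is a minimal Python sketch of that approach. It assumes each .deflate part holds a raw deflate (RFC 1951) stream and that the parts have already been downloaded locally; the file names are placeholders. If Hadoop wrote zlib-wrapped (RFC 1950) streams instead, the 2-byte zlib header and 4-byte Adler-32 trailer would have to be stripped first.

    import struct
    import zlib

    def deflate_to_gzip_member(raw_deflate):
        """Wrap one raw deflate (RFC 1951) stream as a gzip (RFC 1952) member."""
        # Decompress only to compute the CRC-32 and uncompressed length the
        # gzip trailer requires; the compressed bytes are reused as-is.
        data = zlib.decompress(raw_deflate, -15)   # -15 = raw deflate, no zlib header
        crc = zlib.crc32(data) & 0xFFFFFFFF
        isize = len(data) & 0xFFFFFFFF             # uncompressed length mod 2**32

        # Fixed 10-byte header: magic 1f 8b, CM=8 (deflate), no flags,
        # MTIME=0, XFL=0, OS=255 (unknown).
        header = b"\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\xff"
        # 8-byte trailer: CRC-32 and ISIZE, both little-endian.
        trailer = struct.pack("<II", crc, isize)
        return header + raw_deflate + trailer

    # Placeholder part names; adjust to the actual files downloaded from S3.
    parts = ["part-00000.deflate", "part-00001.deflate"]
    with open("combined.gz", "wb") as out:
        for name in parts:
            with open(name, "rb") as f:
                out.write(deflate_to_gzip_member(f.read()))

Because a gzip file may contain several members back to back, writing one member per input file and concatenating them yields a single valid .gz file, which is the same property that makes the GzipCodec route above work.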

Mark Adler
But I already have the files in .deflate format, so how do I convert them to gzip? That is my question. – Naresh Dec 04 '14 at 08:28