
I have a bunch of large compressed files that I want to concatenate. The problem is that they don't have newline characters at the end of the uncompressed data, so if I just cat them together and work on the compressed result, the last line of one file is joined to the first line of the next (which throws an error in the software I'm using). Cat'ing them with a newline inserted between each compressed file doesn't work either; I think gzip sees the raw newline byte after the end of the first gzip stream and treats everything from there on as 'trailing garbage', e.g.:

for f in *.gz; do (cat "${f}"; echo) >> all.gz; done;
gzip -d all.gz 

gzip: all.gz: decompression OK, trailing garbage ignored

What I'd like to do is something like this:

unzip file1.gz | add a newline char | gzip the output >> output.gz

and then do the same with file2.gz, file3.gz, etc., etc.

Any suggestions?

GrahamE

1 Answer


You don't need to decompress and recompress. Simply compress the one-byte new-line character with gzip, and concatenate that between your large gzip files.

echo | gzip > newline.gz
cat file1.gz newline.gz file2.gz newline.gz file3.gz ... > file.gz

It will be a 21-byte file you insert for each new-line, but since you said that your other files are large, that shouldn't matter.
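
For many files, the same idea can be written as a loop. This is only a sketch: the scratch directory, the sample inputs (file1.gz to file3.gz) and the output name all_combined.gz are made up for illustration, and the output name is chosen so that it does not match the file*.gz glob.

```shell
# Work in a scratch directory so the demo does not touch real data.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical inputs: gzip files whose contents lack a trailing newline.
printf 'a1\na2' | gzip > file1.gz
printf 'b1'     | gzip > file2.gz
printf 'c1\nc2' | gzip > file3.gz

# A gzip member that decompresses to a single newline.
echo | gzip > newline.gz

# Emit each input followed by the newline member. The glob must not
# match newline.gz or the output file itself.
for f in file*.gz; do
    cat "$f" newline.gz
done > all_combined.gz

# gzip decompresses all concatenated members as one stream.
result=$(gzip -dc all_combined.gz)
printf '%s\n' "$result"
```

Because every piece is a complete gzip member, gzip -dc reads the whole file without a "trailing garbage" warning, and each original file ends on its own line.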

Mark Adler