Explanation
Use a different approach: Stream the file.
That is, don't load it into memory completely but read it byte per byte and simultaneously write it back byte per byte .
Get an InputStream
to the file, wrap some GZipInputStream
around. Read byte per byte, write to an OutputStream
.
The opposite if you want to compress an archive. Then you wrap the OutputStream
by some GZipOutputStream
.
Code
The example uses Apache Commons Compress but the logic of the code remains the same for all libraries.
Unpacking a gz
archive:
Path inputPath = Paths.get("archive.tar.gz");
Path outputPath = Paths.get("archive.tar");
try (InputStream fin = Files.newInputStream(inputPath );
OutputStream out = Files.newOutputStream(outputPath);) {
GZipCompressorInputStream in = new GZipCompressorInputStream(
new BufferedInputStream(fin));
// Read and write byte by byte
final byte[] buffer = new byte[buffersize];
int n = 0;
while (-1 != (n = in.read(buffer))) {
out.write(buffer, 0, n);
}
}
Packing as gz
archive:
Path inputPath = Paths.get("archive.tar");
Path outputPath = Paths.get("archive.tar.gz");
try (InputStream in = Files.newInputStream(inputPath);
OutputStream fout = Files.newOutputStream(outputPath);) {
GZipCompressorOutputStream out = new GZipCompressorOutputStream(
new BufferedOutputStream(fout));
// Read and write byte by byte
final byte[] buffer = new byte[buffersize];
int n = 0;
while (-1 != (n = in.read(buffer))) {
out.write(buffer, 0, n);
}
}
You could also wrap BufferedReader
and PrintWriter
around if you feel more comfortable with them. They manage the buffering themselves and you can read and write line
s instead of byte
s. Note that this only works correct if you read a file with lines and not some other format.