We have the following Java method that compresses files using GZIPOutputStream:

private void archive(Path originalFile) {
    Path tempFile = originalFile.resolveSibling(originalFile.toFile().getName() + TEMPORARY_FILE_EXTENSION);
    Path gzippedFile = originalFile.resolveSibling(originalFile.toFile().getName() + ARCHIVED_FILE_EXTENSION);
    try {
        try (FileInputStream input = new FileInputStream(originalFile.toFile());
            BufferedOutputStream output = new BufferedOutputStream(new GZIPOutputStream(new FileOutputStream(tempFile.toFile())))) {
            IOUtils.copy(input,output);
            output.flush();
        }
        Files.move(tempFile, gzippedFile, StandardCopyOption.REPLACE_EXISTING);
        Files.delete(originalFile);
        LOGGER.info("Archived file {} to {}", originalFile, gzippedFile);
    } catch (IOException e) {
        LOGGER.error("Could not archive file {}: " + e.getMessage(), originalFile, e);
    }
    try {
        Files.deleteIfExists(tempFile);
    } catch (IOException e) {
        LOGGER.error("Could not delete temporary file {}: " + e.getMessage(), tempFile, e);
    }
}

The problem is that when we manually decompress the file again:

gzip -d file_name

The resulting decompressed file does not match the original file. Both the file size and the total number of lines decrease, for example from 33 MB to 32 MB with a loss of 800K lines.

Could the issue be related to the encoding (EBCDIC) of the files we are compressing? https://en.wikipedia.org/wiki/EBCDIC

Kuikiker
  • @JonSkeet Correct me if I’m wrong, but it appears the OutputStream is already part of the try-with-resources statement. – VGR Mar 12 '19 at 14:28
  • He's perfectly closing the streams. Your problem must be on another layer as it seems. I tried your code and it performed flawlessly. Do you have enough space on the volume? – SirFartALot Mar 12 '19 at 14:30
  • @VGR: You're absolutely right. I'd missed the location of the braces. (Obviously the scrolling doesn't help.) Will delete my first comment to reduce confusion. – Jon Skeet Mar 12 '19 at 15:02
  • @SirFartALot have created some Unit Tests and indeed I cannot reproduce the described issue. I will try to check if there were space issues when the error happened. – Kuikiker Mar 12 '19 at 15:28

1 Answer

After several tests we have not been able to reproduce the issue, so it must have been related to not having enough space on the volume during the compression. @SirFartALot thanks for pointing that out.
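
For anyone who wants to rule out the compression code itself, a round-trip check along these lines may help. It is only a minimal sketch, not the code we run in production: the class name GzipRoundTripCheck and its two helper methods are hypothetical names made up for this illustration. It relies on the fact that GZIPOutputStream works on raw bytes, so the character encoding of the file (EBCDIC or otherwise) should not affect the result.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, used here only to illustrate a round-trip check.
final class GzipRoundTripCheck {

    /** Compresses source into target, copying raw bytes (no charset is involved). */
    static void gzip(Path source, Path target) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            Files.copy(source, out);
        }
    }

    /** Returns true if decompressing gzipped yields exactly the bytes of original. */
    static boolean matchesOriginal(Path original, Path gzipped) throws IOException {
        try (InputStream expected = new BufferedInputStream(Files.newInputStream(original));
             InputStream actual = new BufferedInputStream(
                     new GZIPInputStream(Files.newInputStream(gzipped)))) {
            int a;
            int b;
            do {
                a = expected.read();
                b = actual.read();
                if (a != b) {
                    return false; // differing byte, or one stream ended early
                }
            } while (a != -1);
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: GzipRoundTripCheck <original> <gzipped>");
            return;
        }
        boolean same = matchesOriginal(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(same ? "round trip OK" : "round trip MISMATCH");
    }
}
```

In a method like archive above, a check such as matchesOriginal(originalFile, tempFile) could be called before Files.delete(originalFile), so that the original is only removed once the archive is known to decompress to the same bytes. This roughly doubles the I/O per file, so whether it is worth it depends on how valuable the data is.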

Kuikiker