I am creating tar archives compressed with bzip2 using jarchivelib, which is built on org.apache.commons.compress:

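import java.io.File;
import java.io.IOException;
import org.rauschig.jarchivelib.*; // Archiver, ArchiverFactory, ArchiveFormat, CompressionType

try {
    // TAR container wrapped in bzip2 compression
    Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.TAR, CompressionType.BZIP2);
    File archive = archiver.create(archiveName, destination, sourceFilesArr);
} catch (IOException e) {
    e.printStackTrace();
}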

Occasionally the created file appears to be corrupted, so I want to check for that and recreate the archive if necessary. No exception is thrown; I only noticed the problem when I tried to extract the archive manually with tar -xf file.tar.bz2. (Note: extracting with tar -xjf file.tar.bz2 works flawlessly.) The failing command prints:

tar: Archive contains `\2640\003\203\325@\0\0\0\003\336\274' where numeric off_t value expected
tar: Archive contains `\0l`\t\0\021\0' where numeric mode_t value expected
tar: Archive contains `\003\301\345\0\0\0\0\006\361\0p\340' where numeric time_t value expected
tar: Archive contains `\0\210\001\b\0\233\0' where numeric uid_t value expected
tar: Archive contains `l\001\210\0\210\001\263' where numeric gid_t value expected
tar: BZh91AY&SY"'ݛ\003\314>\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\343\262\037\017\205\360X\001\210: Unknown file type `', extracted as normal file
tar: BZh91AY&SY"'ݛ�>��������������������������������������X�: implausibly old time stamp 1970-01-01 00:59:59
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

Is there a way to check whether a compressed archive is corrupted using org.apache.commons.compress? Since the files can be several GB in size, an approach that avoids decompressing them would be ideal.

bennos

1 Answer

Because bzip2 compresses the data as a single stream, there is no way to check for corruption without decompressing that stream and passing the result to tar.

In your case, however, the failing command hands the file straight to tar without passing it through bzip2 first, and that is the root cause. Always pass the -j flag to tar when the archive is compressed with bzip2; that is why the second command works correctly.
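
Since the question asks about org.apache.commons.compress, the full read-through described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name TarBz2Check and the method isReadable are hypothetical, commons-compress must be on the classpath, and the sketch assumes a version whose BZip2CompressorInputStream reports CRC mismatches as an IOException. Data is streamed through a fixed-size buffer and discarded, so nothing is written to disk, but the whole file still has to be decompressed once.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;

// Hypothetical helper; the class and method names are illustrative only.
final class TarBz2Check {
    private TarBz2Check() {}

    /** Streams through the whole archive and returns false on any read error. */
    static boolean isReadable(Path archive) {
        byte[] buffer = new byte[64 * 1024];
        try (InputStream file = new BufferedInputStream(Files.newInputStream(archive));
             BZip2CompressorInputStream bzip2 = new BZip2CompressorInputStream(file);
             TarArchiveInputStream tar = new TarArchiveInputStream(bzip2)) {
            ArchiveEntry entry;
            while ((entry = tar.getNextEntry()) != null) {
                // Read and discard the entry data so every tar header and
                // every compressed block up to this point gets decoded.
                while (tar.read(buffer) != -1) {
                    // discard
                }
            }
            // Drain whatever follows the last tar entry so the bzip2
            // decoder reaches its end-of-stream marker.
            while (bzip2.read(buffer) != -1) {
                // discard
            }
            return true;
        } catch (IOException | RuntimeException e) {
            // Treat any decoding or parsing failure as "not readable".
            return false;
        }
    }
}
```

A caller could run TarBz2Check.isReadable(archive.toPath()) after archiver.create(...) and recreate the archive when it returns false. A true result only means the file could be read end to end; it does not compare the contents with the source files.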

Zbynek Vyskovsky - kvr000
  • I have hundreds of files that decompressed with the `-xf` flags without a problem; only this one caused trouble. Because of this, and to guard against other failures, I wanted to add a check for corruption. Nevertheless, my first reaction was to add the `-j` flag to my decompression workflow. – bennos Mar 17 '16 at 09:57
  • @bennos : Sure, it depends on whether the file is compressed with `bzip2` or not. If it is, you have to provide the `-j` flag, as this is what performs the `bzip2` decompression. – Zbynek Vyskovsky - kvr000 Mar 17 '16 at 10:06
  • I meant that decompressing hundreds of `bzip2` archives without the `-j` flag worked. I guess `tar` can identify the compression. Maybe this one had an error in the magic bytes/header. – bennos Mar 17 '16 at 10:10
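
The magic-byte guess in the last comment can be tested cheaply without decompressing anything. The sketch below uses only the Java standard library; the class Bzip2Magic and its methods are hypothetical names, not part of jarchivelib or commons-compress. It checks that a file starts with the bzip2 signature: the bytes `BZh`, a block-size digit from `1` to `9`, and then either the block marker `1AY&SY` or the end-of-stream marker of an empty stream. It inspects only the first 10 bytes, so it cannot detect damage further into the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper; not part of jarchivelib or commons-compress.
final class Bzip2Magic {
    // Six bytes that open the first compressed block ("1AY&SY" in ASCII).
    private static final byte[] BLOCK_MAGIC = {0x31, 0x41, 0x59, 0x26, 0x53, 0x59};
    // Six bytes that appear instead when the stream contains no blocks.
    private static final byte[] END_MAGIC = {0x17, 0x72, 0x45, 0x38, 0x50, (byte) 0x90};

    private Bzip2Magic() {}

    /** Returns true if the first 10 bytes look like the start of a bzip2 stream. */
    static boolean looksLikeBzip2(byte[] header) {
        if (header == null || header.length < 10) {
            return false;
        }
        if (header[0] != 'B' || header[1] != 'Z' || header[2] != 'h') {
            return false;
        }
        // The fourth byte is the block size, an ASCII digit from '1' to '9'.
        if (header[3] < '1' || header[3] > '9') {
            return false;
        }
        byte[] marker = Arrays.copyOfRange(header, 4, 10);
        return Arrays.equals(marker, BLOCK_MAGIC) || Arrays.equals(marker, END_MAGIC);
    }

    /** Reads up to 10 bytes from the file and applies the header check. */
    static boolean looksLikeBzip2(Path file) throws IOException {
        byte[] buf = new byte[10];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < buf.length) {
                int n = in.read(buf, filled, buf.length - filled);
                if (n < 0) {
                    break; // file is shorter than 10 bytes
                }
                filled += n;
            }
        }
        return looksLikeBzip2(Arrays.copyOf(buf, filled));
    }
}
```

The tar output quoted in the question begins with `BZh91AY&SY`, which matches this signature, so a check like this would pass for that file and a damaged magic header would not explain the failure.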