
I am trying to untar a tar file into a map using Apache Commons Compress in Java. I am able to untar most tar files, but a few fail with the exception below. I am not sure what is causing the issue. Is the tar file corrupted? I can untar the file using 7-Zip on Windows, but the same file fails when I untar it programmatically. I am using Apache commons-compress 1.18.

java.io.IOException: Error detected parsing the header
at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:285)
at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextEntry(TarArchiveInputStream.java:552)

Caused by: java.lang.IllegalArgumentException: At offset 124, 12 byte binary number exceeds maximum signed long value
at org.apache.commons.compress.archivers.tar.TarUtils.parseBinaryBigInteger(TarUtils.java:213)
at org.apache.commons.compress.archivers.tar.TarUtils.parseOctalOrBinary(TarUtils.java:177)
at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1283)
at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1266)
at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:404)
at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:283)
... 25 more

Below is my code:

public static Map<String, byte[]> unTarToMap(byte[] b) throws IOException, ArchiveException {
        final Map<String, byte[]> untaredFiles = new HashMap<>();
        // try-with-resources ensures the stream is closed even if an entry fails to parse
        try (TarArchiveInputStream debInputStream = (TarArchiveInputStream) new ArchiveStreamFactory()
                .createArchiveInputStream("tar", new ByteArrayInputStream(b))) {
            TarArchiveEntry entry;
            while ((entry = (TarArchiveEntry) debInputStream.getNextEntry()) != null) {
                // read the current entry's contents fully into memory
                final ByteArrayOutputStream outputFileStream = new ByteArrayOutputStream();
                IOUtils.copy(debInputStream, outputFileStream);
                untaredFiles.put(entry.getName(), outputFileStream.toByteArray());
            }
        }
        return untaredFiles;
    }
sparker

1 Answer


You might be hitting a limitation of Commons Compress. At offset 124 of its header, a tar entry stores its size. Commons Compress tries to represent the size as a Java long, whose maximum value is pretty large (2^63 - 1), but in theory tar entries can be bigger than that.

Either you really have a tar archive with entries that big (7-Zip should be able to tell you how big it thinks the entry is), or you are hitting a bug. There are many different dialects of tar, and it is quite possible Commons Compress thinks your archive uses a specific dialect when it doesn't. In that case it would be best to open a bug report with Apache Commons Compress at https://issues.apache.org/jira/projects/COMPRESS/ and, if possible, provide the archive causing the exception.

BTW, the line numbers in your stack trace do not match Compress 1.18, so you are probably not using the version you think you are.
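
If you want to see which case you are in, a minimal sketch like the following (the class name and helper methods are my own invention, not part of Commons Compress) reads the size field straight out of a raw 512-byte tar header block. A first byte with the high bit (0x80) set means the size is stored in base-256 (binary) form, which is the path that goes through `TarUtils.parseBinaryBigInteger` and can exceed `Long.MAX_VALUE`:

```java
import java.util.Locale;

public class TarSizeField {
    // In a ustar header block the size field occupies bytes 124..135.
    static final int SIZE_OFFSET = 124;
    static final int SIZE_LENGTH = 12;

    /** Hex dump of the 12-byte size field of a 512-byte header block. */
    static String sizeFieldHex(byte[] header) {
        StringBuilder sb = new StringBuilder();
        for (int i = SIZE_OFFSET; i < SIZE_OFFSET + SIZE_LENGTH; i++) {
            sb.append(String.format(Locale.ROOT, "%02x", header[i]));
        }
        return sb.toString();
    }

    /**
     * True if the size is stored in base-256 (binary) form, i.e. the high
     * bit of the first byte is set, rather than as an octal ASCII string.
     */
    static boolean usesBase256Size(byte[] header) {
        return (header[SIZE_OFFSET] & 0x80) != 0;
    }
}
```

Dumping those 12 bytes for the entry that fails would also be a harmless way to attach the relevant data to a bug report without sharing confidential file contents.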

Stefan Bodewig
  • My bad, I pasted the wrong stack trace. I kept changing the version and testing it, but the exception is the same. I think I pasted the exception from version 1.15, but it happens with 1.18 as well. I will edit the question to change the version. And I can't share the tar file that causes the issue since it contains confidential information. – sparker Jan 13 '19 at 11:17
  • You probably should still open a bug over in the Commons Compress JIRA and maybe provide parts of the archive (like the 12 bytes that make up the size inside the header causing the exception). – Stefan Bodewig Jan 14 '19 at 06:04