I am using commons-compress to process tarball files and noticed that even files that are not tar archives seem to be processed. Why is this? Is there a better library for detecting valid tar files?

 <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-compress</artifactId>
      <version>1.20</version>
 </dependency>

bug689.csv is a CSV file. The test fails because te.isFile() apparently returns true, and te.getName() seems to return the contents of the CSV. Is this a bug, or am I using the package incorrectly? I'd expect the InputStream not to be successfully converted to a TarArchiveEntry.

    @Test
    public void testTarball() throws IOException{
        InputStream tarData = this.getClass().getResourceAsStream("/bug689.csv");
        TarArchiveInputStream tis = new TarArchiveInputStream(tarData);
        TarArchiveEntry te = tis.getNextTarEntry();
        assertFalse(te.isFile());
    }

Yana K.
  • If I run your code on a csv file, then `tis.getNextTarEntry()` returns `null`. If I run it on a tar file (which happens to have a csv file suffix), and which contains a regular csv file, then `te.isFile()` returns `true`. All as I would expect. Are you absolutely sure `bug689.csv` is an uncompressed file (forgive me for asking)? – andrewJames Feb 20 '20 at 20:59

1 Answer

If you are not dealing with a tar file, then tis.getNextTarEntry() will return null, so you have to check for that explicitly.

But if you do have a valid tar file, be wary of relying on te.isFile(). The first item in your tar may not be a regular file. It may be a directory or something else.

The tar file may even be empty, in which case tis.getNextTarEntry() will again return null.

If you only want to test for a tar containing one regular file, then I see no issue with using te.isFile().
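
If you would rather rule out non-tar input before handing the stream to commons-compress, a rough pre-check is possible with the JDK alone. The sketch below relies on the POSIX ustar layout, in which each 512-byte header carries the ASCII magic "ustar" at byte offset 257. The class name TarSniffer and the method looksLikeUstar are made up for this illustration and are not part of commons-compress. Note the limits: very old (pre-POSIX, V7) tar files carry no magic and would be reported as non-tar, and the check consumes the first 512 bytes of the stream, so wrap the stream in a BufferedInputStream and use mark/reset if you need to read it again afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of commons-compress.
class TarSniffer {
    private static final int HEADER_SIZE = 512;   // size of one tar header block
    private static final int MAGIC_OFFSET = 257;  // position of the ustar magic field
    private static final byte[] MAGIC = "ustar".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the first 512 bytes contain the ustar magic at offset 257.
    // Consumes up to 512 bytes from the stream.
    static boolean looksLikeUstar(InputStream in) throws IOException {
        byte[] header = new byte[HEADER_SIZE];
        int read = 0;
        while (read < HEADER_SIZE) {
            int n = in.read(header, read, HEADER_SIZE - read);
            if (n < 0) {
                break; // end of stream before a full header was read
            }
            read += n;
        }
        if (read < HEADER_SIZE) {
            return false; // too short to hold even one tar header
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[MAGIC_OFFSET + i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A short CSV payload: far smaller than a tar header, so not a tar.
        byte[] csv = "id,name\n1,alice\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeUstar(new ByteArrayInputStream(csv)));

        // A zeroed 512-byte block with only the magic filled in.
        byte[] fakeHeader = new byte[HEADER_SIZE];
        System.arraycopy(MAGIC, 0, fakeHeader, MAGIC_OFFSET, MAGIC.length);
        System.out.println(looksLikeUstar(new ByteArrayInputStream(fakeHeader)));
    }
}
```

This only tells you the header looks like ustar. It does not validate the header checksum or the rest of the archive, so you should still check the result of tis.getNextTarEntry() for null.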

andrewJames
  • I would expect it to be null, but it's not; this was my first surprise. Not only is it not null, it seems to think the content is a valid file?! I pushed a sample repo here: https://github.com/yanakad/commons-compress-test. If you have a version that produces null for `tis.getNextTarEntry()`, I would much appreciate advice on what is different – Yana K. Feb 24 '20 at 13:49
  • I'm sorry - I can't recreate the problem you are having. Your github code, with no changes, works fine for me. That is strange. – andrewJames Feb 24 '20 at 14:43
  • Thank you, you've actually been very helpful. I was seeing strange results but was only running via an IDE. After your comment I force-wiped the local Maven cache and ran from the command line. I finally get an NPE – Yana K. Feb 25 '20 at 17:18
  • Glad I could "[help](https://en.wikipedia.org/wiki/Rubber_duck_debugging)". Thanks for sharing the solution, though - appreciate it. – andrewJames Feb 25 '20 at 17:49