
I get some very odd errors when using org.apache.commons.compress to read embedded archive files, and I suspect my inexperience is to blame.

When running my code I get a variety of truncated zip file errors (along with other truncated file errors). I suspect the problem is my use of ArchiveInputStream:

private final void handleArchive(String fileName, ArchiveInputStream ais) {
   ArchiveEntry archiveEntry = null;

   try {
      while((archiveEntry = ais.getNextEntry()) != null) {

         byte[] buffer = new byte[1024];

         while(ais.read(buffer) != -1) {
            handleFile(fileName + "/" + archiveEntry.getName(), archiveEntry.getSize(), new ByteArrayInputStream(buffer));
         }
      }
   } catch(IOException ioe) {
      ioe.printStackTrace();
   }
}

When I call archiveEntry = ais.getNextEntry(), does this effectively close my ais? And is there any way to read the bytes of embedded archive files using commons compress?

nullByteMe

1 Answer

You seem to be doing something odd. For each archive entry, while you are still reading the archive, you recursively call your archive-reading method, which opens the next archive while your parent code is still handling the previous one.

You should read each archive entry completely before handling any new archive entry in your compressed file. Something like:

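ArArchiveEntry entry = (ArArchiveEntry) arInput.getNextEntry();
// getSize() returns a long; this cast assumes the entry fits in a single array
byte[] content = new byte[(int) entry.getSize()];
int offset = 0;
// read() may return fewer bytes than requested, so loop until the entry is complete
while (offset < content.length) {
    int n = arInput.read(content, offset, content.length - offset);
    if (n == -1) break; // stream ended before the declared size was reached
    offset += n;
}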

as shown in the examples on the Apache site.
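To illustrate the "finish one entry before opening the next" idea end to end, here is a minimal sketch. It deliberately uses the JDK's java.util.zip.ZipInputStream rather than commons-compress so that it runs without extra dependencies; the loop has the same shape with an ArchiveInputStream, whose read() also returns -1 at the end of the current entry. The class name NestedZipReader, the helper names, and the ".zip" suffix check are hypothetical choices made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class NestedZipReader {

    /** Reads the current entry to its end and returns all of its bytes. */
    static byte[] readEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // copy only the n bytes actually read
        }
        return out.toByteArray();
    }

    /**
     * Walks a zip stream and stores every regular file under prefix/entryName.
     * Entries whose name ends in ".zip" are buffered completely first and only
     * then opened as a nested archive, so the parent stream is never read
     * while a child archive is being processed.
     */
    static void collect(String prefix, InputStream source, Map<String, byte[]> files)
            throws IOException {
        ZipInputStream zis = new ZipInputStream(source);
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue;
            }
            String path = prefix + "/" + entry.getName();
            byte[] content = readEntry(zis); // whole entry, not one buffer-sized chunk
            if (entry.getName().endsWith(".zip")) {
                collect(path, new ByteArrayInputStream(content), files);
            } else {
                files.put(path, content);
            }
        }
    }
}
```

Calling NestedZipReader.collect("outer.zip", inputStream, map) fills the map with one byte array per file, keyed by its path through the nested archives. Buffering each entry in memory keeps the sketch simple but assumes the entries are small enough to fit in the heap.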

Filip
  • Is there a way to guess which archiver to use for a given file rather than creating objects for each of those archivers? – nullByteMe Aug 05 '13 at 19:26
  • It all depends on what you want to do. Do you want to read only one specific file in the archive, and if so, do you know which one? All files in the archive can be accessed using the ArchiveEntry object. – Filip Aug 05 '13 at 19:31
  • I'm reading an entire directory of files which can include any type of archive. I want to open every archive and do something with its bytes (in another object). – nullByteMe Aug 05 '13 at 19:33
  • Well, then you have to iterate over all the ArchiveEntry objects and read the entire byte stream of each entry before you move on to the next entry, as I suggested in my post. The type of archive entry is defined by the file name's extension. Have a look at the example link I posted. Various archive formats are listed there with their corresponding implementation classes. – Filip Aug 05 '13 at 19:37
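On the question of guessing the archiver: besides going by file extension, the format can often be recognised from the first few bytes of the stream. Commons Compress provides ArchiveStreamFactory.createArchiveInputStream(InputStream), which autodetects the archive type this way and requires a stream that supports mark/reset (for example a BufferedInputStream). The sketch below only illustrates the underlying idea with plain JDK classes; it is not the library's implementation. The class name ArchiveSniffer, the returned labels, and the small set of signatures checked are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;

class ArchiveSniffer {

    /**
     * Peeks at the first bytes of a mark-supporting stream and returns a
     * format label, then rewinds so the caller can read from the start.
     * Only zip, gzip and ar signatures are checked here; tar is left out
     * because its "ustar" marker sits at offset 257, beyond this peek.
     */
    static String sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[8];
        int total = 0;
        in.mark(head.length);
        try {
            int n;
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) != -1) {
                total += n;
            }
        } finally {
            in.reset(); // rewind regardless of what was found
        }
        if (startsWith(head, total, 'P', 'K', 3, 4)) {
            return "zip"; // local file header of a non-empty zip
        }
        if (startsWith(head, total, 0x1f, 0x8b)) {
            return "gzip";
        }
        if (startsWith(head, total, '!', '<', 'a', 'r', 'c', 'h', '>', '\n')) {
            return "ar";
        }
        return "unknown";
    }

    private static boolean startsWith(byte[] head, int len, int... magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((head[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Wrapping a FileInputStream in a BufferedInputStream before calling ArchiveSniffer.sniff(...) gives the required mark/reset support, and the same wrapped stream can then be handed to whichever reader matches the returned label.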