I'm using the Apache Commons Compress example from another post here to extract files from a tar archive, but it fails with:
java.io.IOException: Invalid file path.
This only happens with some of the VMware OVA files I pass to it (OVA files are tar archives); others extract fine.
Here is the code:
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;

    public static void unTar(File tarFile, File dest) throws IOException {
        dest.mkdir();
        // try-with-resources closes the tar stream even if extraction fails
        try (TarArchiveInputStream tarIn = new TarArchiveInputStream(
                new BufferedInputStream(new FileInputStream(tarFile)))) {
            TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
            while (tarEntry != null) {
                // create a file with the same name as the tarEntry
                File destPath = new File(dest, tarEntry.getName());
                System.out.println("working: " + destPath.getCanonicalPath());
                if (tarEntry.isDirectory()) {
                    destPath.mkdirs();
                } else {
                    destPath.createNewFile();
                    byte[] btoRead = new byte[1024];
                    try (BufferedOutputStream bout =
                            new BufferedOutputStream(new FileOutputStream(destPath))) {
                        int len;
                        // read() returns -1 at the end of the current entry
                        while ((len = tarIn.read(btoRead)) != -1) {
                            bout.write(btoRead, 0, len);
                        }
                    }
                }
                tarEntry = tarIn.getNextTarEntry();
            }
        }
    }
It looks like the problem is introduced by tarEntry.getName(), at the point where the value of destPath is set. Stepping through with the debugger, destPath picks up extra undisplayable characters plus the word "someone" in the path:
target/mybuildname-SNAPSHOT/extractedDirectory/<garbage characters>someone/test.ovf
For OVA files that I can successfully untar, the value of destPath looks normal:
target/mybuildname-SNAPSHOT/extractedDirectory/test.ovf
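To see exactly which undisplayable characters end up in the name, the string returned by tarEntry.getName() could be printed with everything outside printable ASCII escaped. This is only a small debugging sketch; `NameDebug` and `escape` are names made up for illustration and are not part of Commons Compress, and the sample string in `main` is invented, not taken from my actual file.

```java
class NameDebug {

    // Returns the input with every character outside printable ASCII
    // replaced by a backslash-u escape holding its four-digit hex code.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical entry name with two control characters in front
        String name = "\001\002someone/test.ovf";
        System.out.println(escape(name));
    }
}
```

In the extraction loop this would be called as `System.out.println(NameDebug.escape(tarEntry.getName()));` before destPath is built.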
The "someone" text is a decent clue: I see it in the headers of both the working and the failing tar (OVA) files when I view them with hexdump -C, but not at the same offsets.
I suspect the solution involves figuring out the offset at which the filename is stored and reading from that specific offset. That's my best guess, but I'm not very good at reading hex.
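To compare the two headers field by field instead of eyeballing hexdump output, something like the following could help. It is a sketch only, and it assumes the standard POSIX ustar layout for the first 512-byte header block (name at offset 0, 100 bytes; magic at 257, 6 bytes; user name at 265, 32 bytes; group name at 297, 32 bytes; prefix at 345, 155 bytes). Whether the failing OVA files actually follow that layout is exactly what is in doubt. `TarHeaderDump` and `field` are hypothetical names, not Commons Compress API.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class TarHeaderDump {

    // Reads one header field: bytes from offset up to the first NUL,
    // or up to the full field length if no NUL is present.
    static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[512];
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            in.readFully(header); // first header block of the archive
        }
        // Offsets below assume the ustar layout described above
        System.out.println("name   @0   : " + field(header, 0, 100));
        System.out.println("magic  @257 : " + field(header, 257, 6));
        System.out.println("uname  @265 : " + field(header, 265, 32));
        System.out.println("gname  @297 : " + field(header, 297, 32));
        System.out.println("prefix @345 : " + field(header, 345, 155));
    }
}
```

Running it as `java TarHeaderDump test.ova` against a working and a failing file should show which field the "someone" text and the garbage bytes sit in for each. Non-printable bytes may not display, so the output is best piped through hexdump -C as well.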
It's important to note that my goal is to read the OVF XML file inside the OVA, and that I don't control the creation of the OVA files, so I can't fix the problem in the header beforehand. The OVA files themselves are perfectly functional, and I can successfully untar them from the command line with tar -xvf test.ova. In fact, if I repackage the tar file from the command line, the above code works.