I can successfully read a PDF file from a tar.gz archive, but I am running into a performance problem: it takes a long time to open a tar.gz archive containing more than 1000 small PDF files, each 10 - 25 MB in size. The total size of the archive is 2 GB.
How can I improve the performance of reading a file from the archive? My current code:
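import java.io.BufferedInputStream;
import java.io.FileInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

// tarName, fileName and out are defined elsewhere.
// A single TarArchiveInputStream is enough; wrapping it in a second one is redundant.
try (FileInputStream fin = new FileInputStream(tarName);
     BufferedInputStream in = new BufferedInputStream(fin);
     GzipCompressorInputStream gzIn = new GzipCompressorInputStream(in);
     TarArchiveInputStream tarIn = new TarArchiveInputStream(gzIn)) {
    TarArchiveEntry entry;
    byte[] buffer = new byte[5024];
    int nrBytesRead;
    // Walk the entries in order until the requested file is found.
    while ((entry = (TarArchiveEntry) tarIn.getNextEntry()) != null) {
        System.out.println("it finds a file " + entry.getName());
        if (entry.getName().equals(fileName)) {
            // Copy the contents of the matching entry to the output stream.
            while ((nrBytesRead = tarIn.read(buffer)) != -1) {
                out.write(buffer, 0, nrBytesRead);
            }
            break;
        }
    }
}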
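
For reference, here is a minimal sketch of the direction I am considering, not a measured solution. A gzip stream can only be read front to back, so every entry that precedes the wanted file has to be decompressed. What the code can control is the buffer sizes, the per-entry logging, and stopping as soon as the file has been copied. To keep the sketch runnable without dependencies it uses the JDK's GZIPInputStream and a deliberately simplified tar header parser in place of Commons Compress (plain headers only, no long-name or pax extensions). The class name TarGzLookup, the method copyEntry and the 64 KB buffer size are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: simplified tar.gz lookup, plain tar headers only. */
public class TarGzLookup {
    private static final int BLOCK = 512;         // tar block size
    private static final int BUFFER = 64 * 1024;  // assumed value; tune by measuring

    /** Copies the entry named fileName to out; returns false if it is absent. */
    public static boolean copyEntry(InputStream tarGz, String fileName, OutputStream out)
            throws IOException {
        try (InputStream in =
                 new GZIPInputStream(new BufferedInputStream(tarGz, BUFFER), BUFFER)) {
            byte[] header = new byte[BLOCK];
            byte[] buffer = new byte[BUFFER];
            // A header whose first byte is zero marks the end of the archive.
            while (readFully(in, header) && header[0] != 0) {
                String name = field(header, 0, 100);
                long size = Long.parseLong(field(header, 124, 12).trim(), 8); // octal
                if (name.equals(fileName)) {
                    long left = size;
                    while (left > 0) {
                        int n = in.read(buffer, 0, (int) Math.min(buffer.length, left));
                        if (n < 0) throw new EOFException("truncated entry " + name);
                        out.write(buffer, 0, n);
                        left -= n;
                    }
                    return true; // stop decompressing once the file is copied
                }
                // Not the wanted file: discard its data plus padding to a block boundary.
                discard(in, (size + BLOCK - 1) / BLOCK * BLOCK, buffer);
            }
            return false;
        }
    }

    /** Reads a NUL-terminated ASCII field from the header. */
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) end++;
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Fills block completely; returns false if the stream ends first. */
    private static boolean readFully(InputStream in, byte[] block) throws IOException {
        int off = 0;
        while (off < block.length) {
            int n = in.read(block, off, block.length - off);
            if (n < 0) return false;
            off += n;
        }
        return true;
    }

    /** Reads and drops count bytes, reusing scratch as the read buffer. */
    private static void discard(InputStream in, long count, byte[] scratch)
            throws IOException {
        while (count > 0) {
            int n = in.read(scratch, 0, (int) Math.min(scratch.length, count));
            if (n < 0) throw new EOFException("truncated archive");
            count -= n;
        }
    }
}
```

With Commons Compress the same ideas would apply to the TarArchiveInputStream loop: pass a larger buffer size to BufferedInputStream, enlarge the copy buffer, and drop the println per entry. If many different files are read from the same archive, each lookup rescans it from the start, so extracting everything once in a single pass is probably cheaper than repeated lookups.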