-1

zip4j is a great library. But i run into a problem when using it in a class that uses a thread. The zip4j method is called from a class that implements thread and sometimes (not always) it leaves files uncompress and somtimes there are leftofer files with the extension *.zip345. Also the process returns net.lingala.zip4j.exception.ZipException: cannot rename modified zip file.

The method zip4jProcess is called from the class public method. Class name is: SZipInterface.class

The SZipInterface.class is initialized in the thread class ex: ThreadObj.class and instantiated per thread. No static method is used.

What is the cause of the problems? How do you fix it? Is zip4j thread safe?

Method:

    private int zip4jProcess() {
    int status = 0;
    if (null != getInFiles() && getInFiles().length > 0) {
        for (String file : getInFiles()) {
            File sourceFile = new File(file);
            ZipFile zipFile = null;
            ZipParameters zipParams = new ZipParameters();
            if (getPassword() != null
                    && !getPassword().trim().equalsIgnoreCase("")) {
                zipParams.setPassword(getPassword());
                zipParams.setEncryptFiles(true);
                zipParams
                        .setEncryptionMethod(Zip4jConstants.ENC_METHOD_STANDARD);

            }
            zipParams
                    .setCompressionLevel(Zip4jConstants.DEFLATE_LEVEL_NORMAL);

            if (sourceFile.exists()) {
                try {
                    zipFile = new ZipFile(getZipFileName());
                    if (zipFile.getFile().exists()) {
                        zipFile.addFile(sourceFile, zipParams);
                        if (log.isDebugEnabled()) {
                            log.debug("Adding: " + sourceFile.getName()
                                    + " to " + zipFile.getFile().getName()
                                    + " Pass: " + getPassword());
                        }
                    } else {
                        zipFile.createZipFile(sourceFile, zipParams);
                        if (log.isDebugEnabled()) {
                            log.debug("Creating: " + sourceFile.getName()
                                    + " to " + zipFile.getFile().getName()
                                    + " Pass: " + getPassword());
                        }
                    }
                } catch (ZipException e) {
                    log.error(e);
                    status = 1;
                }
            }
        }
    }

    return status;
}
Laurentiu L.
  • 6,566
  • 1
  • 34
  • 60
Mohammad Irfan
  • 101
  • 3
  • 12
  • Is there a reason you've come up with such a construct, instead of using a single thread for the zipping? Note that you could write multi-threaded code where each thread would pack their own zip file. In your code however it seems your shared resource is coming from `getInFiles()` and your code is not thread safe. – Kayaman Jun 16 '15 at 12:33
  • The reason using multithread is to get faster result to produce multiple files. Can I just use synchronized keyword on the method? – Mohammad Irfan Jun 16 '15 at 13:01
  • How many zip files are you creating? Multi-threading isn't a magical "make things go faster" trick, as you're seeing. `Synchronized` is also not magical. I recommend using a single thread and learning how to properly implement multi-threaded programs (and where it makes sense, because in your program it doesn't make sense). – Kayaman Jun 16 '15 at 13:02
  • about 130 zip files divided to 7 thread. Each file has unique file name after zip. It should not overlap the process. I suspect the overlapping is happened because zip4j using or writing to the same temporary file. – Mohammad Irfan Jun 16 '15 at 13:04
  • Well you can read the sources to see how it handles the temporary files. – Kayaman Jun 16 '15 at 13:06
  • Have been tracing it, but i'm a little noob so I can't completely grasp it. But the goal is not modifying the library itself, but how to make sure the process is run safely in thread. My temporary fix is to add synchronized keyword to the method. With that I hope that the method is run serialy. – Mohammad Irfan Jun 17 '15 at 01:19
  • 1
    Instead of making the method synchronized, why don't you just make the packing single threaded? You'll get rid of unnecessary complexity (i.e. extra threads that you've now noticed are useless to you) at the same time, it's a win-win situation. – Kayaman Jun 17 '15 at 05:04

1 Answers1

0

I believe the times where you have leftovers or uncomprossed files may be when multiple threads try to use the same zip file (probably at zipFile.addFile(...)).

So try handling the addFile differently with concurrency in mind.

Their support forum said it's tricky and not currently supported - see the link for the limitations of doing it.

This can be quite tricky to implement, if not impossible to achieve, especially when using encryption or when compressing the file (and not just using the store method, which just copies the source file to the zip without any compression). A current block of file being compressed/decompressed depends on the previous block. So, if multiple threads were to read or write, these threads cannot do this process simultaneously, but have to wait until the block n-1 (if n is the current block) is read/wrote. So, its as good as running the process in the same thread.

Writing different files in different threads to a zip file (each thread handling a unique file in the zip) can be tricky as well. For example: AES encryption requires a unique number (as part of salt calculation) for each file in the zip. And another example: if a zip file is being created and multiple number of files being added (with compression), then the second thread, which will start writing the second file to the zip should know exactly at which location in the zip file to start writing, and this cannot be determined until the first thread is done writing.

Some compression algorithms, like LZMA/LZMA2, support multithreading. Unfortunately, these compression methods are not supported by Zip4j at the moment.

Full text of their response (in case the post gets removed).

Laurentiu L.
  • 6,566
  • 1
  • 34
  • 60
  • The response is specifically whether the compression supports multithreading. It has very little to do with the OP's code, which apparently attempts to do some horrible voodoo multithreading determined by filenames and other such things. – Kayaman Jun 16 '15 at 12:30
  • @Kayaman I see. The question which the quoted answer answered was: can multiple threads read and write to the zip file at the same time without blowing up? So i thought because he calls new ZipFile(getZipFileName()); he will end up sometimes with the same file and thus end up in the said case. Correct me if i am wrong, but the quoted block does not specifically mean compression only , does it? – Laurentiu L. Jun 16 '15 at 12:40
  • Did zip4j using temporary files to construct zip files? I think the temporary file is shared by the process. – Mohammad Irfan Jun 16 '15 at 12:43
  • @LaurentiuL. The answer is describing that the library can't be made multithreaded (due to the algorithm). Although it looks like that's not what the asker was asking about. Reading and writing at the same time from/to a zip file sounds bizarre. – Kayaman Jun 16 '15 at 12:59