4

I have an encrypted ODT (Open Document Text) file and I need to unzip it. An ODT is a ZIP file; an encrypted ODT is still a normal ZIP file in which some of the entries are encrypted.

Using ZipFile works fine in a test, but I cannot really use it because my data is a stream in memory and I don't want to work with a file.

So I use ZipInputStream instead. But calling ZipInputStream.getNextEntry() throws the dreaded "only DEFLATED entries can have EXT descriptor" exception.

From what I can tell, it throws on the first encrypted file inside the ZIP package, for example content.xml. Because OpenOffice had already encrypted the XML file, there was probably no point in compressing it, so it was stored in the ZIP package uncompressed.

But ZipInputStream seems to have a problem with it and I don't see a way around.

And yes, the encrypted ODT file was created by OpenOffice Writer 3.2.1. And yes, the stock ZipInputStream cannot even enumerate through entries in it.

Anything you can suggest?

romeok
  • Did you try to understand why ZipFile works and mimic that code? – AlexR Dec 21 '10 at 11:04
  • ZipFile very quickly calls into native code to work with a file. I need to have that in a memory stream unfortunately. – romeok Jan 05 '11 at 10:04

3 Answers

1

The problem has nothing to do with encryption, but with the fact that ZipInputStream does not expect (and does not know how to handle) an EXT descriptor when the associated data was not DEFLATED (i.e. was stored uncompressed, as-is). This may well be a deficiency ("bug") in ZipInputStream, but I am not familiar enough with the zip specs to know one way or another.

An inelegant, even downright ugly workaround is to persist the stream to a temporary file, and then process it as a ZipFile.
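A minimal sketch of that temporary-file workaround, using only the standard library. The class name `TempFileZipReader` and the helper `listEntries` are hypothetical names chosen for illustration, and the tiny ZIP built in `main` only stands in for the real in-memory ODT stream. It does not reproduce the stored-entry-with-EXT-descriptor layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class TempFileZipReader {

    /** Spools the stream to a temporary file and lists its entries via ZipFile. */
    public static List<String> listEntries(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("odt-", ".zip");
        try {
            // createTempFile already created the file, so it must be replaced.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            List<String> names = new ArrayList<>();
            // ZipFile reads the central directory, so it does not depend on
            // the per-entry EXT descriptors that trip up ZipInputStream.
            try (ZipFile zip = new ZipFile(tmp.toFile())) {
                Enumeration<? extends ZipEntry> entries = zip.entries();
                while (entries.hasMoreElements()) {
                    names.add(entries.nextElement().getName());
                }
            }
            return names;
        } finally {
            // Remove the temporary copy whether or not reading succeeded.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the in-memory ODT: a small ZIP with one entry.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            out.putNextEntry(new ZipEntry("content.xml"));
            out.write("<doc/>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        System.out.println(listEntries(new ByteArrayInputStream(buf.toByteArray())));
    }
}
```

To read an entry's bytes rather than just its name, `zip.getInputStream(entry)` can be called inside the same try block, before the ZipFile is closed and the temporary file deleted.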

(I am the author of ODFind and the "Decrypting ODF Files" document mentioned in another answer here.)

1

You could check whether this is possible with the ODF Toolkit library.

lujop
0

Have you stumbled upon what Ringlord did in ODFind to read encrypted ODF files? This ODF document (viewable as HTML here courtesy Google) claims there is simply no way to rely solely on the Java libraries to decrypt OpenOffice.org documents. However, the author explains how one can decrypt the content.xml payload of an ODF file with knowledge of the ODF Manifest, RFC 2898 (which defines PBKDF2), the PBKDF2Engine in JBoss3 and some original discovery by the author.
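For orientation only, here is a rough sketch of PBKDF2 with HMAC-SHA1 that takes the password as raw bytes, written directly against the RFC 2898 definition. This is not the ODFind code or the JBoss PBKDF2Engine. The class name `RawPbkdf2`, the salt, the iteration count, the key length, and the idea of feeding in a SHA-1 digest of the password are all assumptions made for illustration. The real values would have to come from the document's manifest and the linked write-up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class RawPbkdf2 {

    /** PBKDF2-HMAC-SHA1 over an arbitrary (non-empty) byte-array password. */
    public static byte[] derive(byte[] password, byte[] salt, int iterations, int keyLen)
            throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(password, "HmacSHA1"));
        int hLen = mac.getMacLength();
        int blocks = (keyLen + hLen - 1) / hLen;
        byte[] out = new byte[keyLen];
        for (int i = 1; i <= blocks; i++) {
            // U_1 = HMAC(password, salt || INT(i)), block index big-endian.
            mac.update(salt);
            mac.update(ByteBuffer.allocate(4).putInt(i).array());
            byte[] u = mac.doFinal();
            byte[] t = u.clone();
            // U_j = HMAC(password, U_{j-1}); T_i is the XOR of all U_j.
            for (int j = 1; j < iterations; j++) {
                u = mac.doFinal(u);
                for (int k = 0; k < t.length; k++) {
                    t[k] ^= u[k];
                }
            }
            int off = (i - 1) * hLen;
            System.arraycopy(t, 0, out, off, Math.min(hLen, keyLen - off));
        }
        return out;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Assumption for illustration: the PBKDF2 input is a binary digest of
        // the password rather than the password text itself.
        byte[] digest = MessageDigest.getInstance("SHA-1")
                .digest("secret".getBytes(StandardCharsets.UTF_8));
        byte[] salt = new byte[16];          // placeholder salt, not from a real manifest
        byte[] key = derive(digest, salt, 1024, 16); // placeholder parameters
        System.out.println("derived key bytes: " + key.length);
    }
}
```

The reason for hand-rolling the loop here is that `PBEKeySpec` accepts the password only as a `char[]`, which is awkward when the input is arbitrary binary data; for plain text passwords the built-in `PBKDF2WithHmacSHA1` factory would normally be the simpler choice.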

Best wishes.

DISCLAIMER: I have no affiliation whatsoever with Ringlord, even though every link above points to Ringlord content.

David J. Liszewski