
I have a zip file that contains a single file: "Indulás előtt.html" (the name is Hungarian text).

But when I try to unzip it, I get an error on the getNextEntry line:

try {
    ZipInputStream zis = new ZipInputStream(getResources().openRawResource(R.raw.ie));
    ZipEntry ze = null;
    while ((ze = zis.getNextEntry()) != null) {
        info.setText(info.getText() + "\nName: " + ze.getName());
    }
} catch (Exception e) {
    info.setText(info.getText() + "\nERROR: " + e.getMessage());
}

The error message is: "Input at 5 does not match UTF8 Specification"

Later I tried another approach:

ZipFile zipfile = new ZipFile(file);
for (Enumeration e = zipfile.entries(); e.hasMoreElements();) {
    ZipEntry entry = (ZipEntry) e.nextElement();
    String name = new String(entry.getName().getBytes("UTF-8"), "UTF-8");
    info.setText(info.getText() + "\nName: " + name);
}

but it displayed this:

Image

What is the solution?

The text includes these letters:

Á: http://en.wikipedia.org/wiki/%C3%81

Ő: http://en.wikipedia.org/wiki/%C5%90#Hungarian

Robertoq

1 Answer


The filename character-set of a zip file can be ambiguous. Java 7's zip implementation should be able to detect the UTF-8 flag (http://docs.oracle.com/javase/7/docs/api/java/util/zip/package-summary.html#lang_encoding) but this relies on the packaging application to have correctly encoded the filename and set the requisite UTF-8 flag.

I suspect that your zip file was packaged incorrectly, or that its filenames are not encoded as UTF-8. Try passing the default zip charset, Cp437:

E.g.

ZipInputStream zis = new ZipInputStream(getResources().openRawResource(R.raw.ie), Charset.forName("Cp437"));
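
To see how the charset argument and the UTF-8 flag interact, here is a minimal sketch that runs on desktop Java 8 or later with in-memory archives. The class `ZipNameDemo` and its helpers `makeZip` and `listNames` are hypothetical names made up for this illustration, and the non-ASCII letters are written as Unicode escapes so the source file's own encoding does not matter. It assumes the `ZipInputStream(InputStream, Charset)` constructor is available; on Android I believe that requires a fairly recent API level, so check that for your target.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the original question or answer.
final class ZipNameDemo {

    // Builds a one-entry zip in memory, encoding the entry name with nameCharset.
    static byte[] makeZip(String entryName, Charset nameCharset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer, nameCharset)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Lists entry names, decoding them with nameCharset unless the entry
    // carries the UTF-8 flag, in which case the JDK decodes the name as UTF-8.
    static List<String> listNames(byte[] zipBytes, Charset nameCharset) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), nameCharset)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Charset cp437 = Charset.forName("Cp437");

        // Name stored as Cp437 bytes without the UTF-8 flag ("Indulás.html"):
        // the reader must be told the charset to decode it correctly.
        byte[] legacy = makeZip("Indul\u00e1s.html", cp437);
        System.out.println(listNames(legacy, cp437));

        // Name stored as UTF-8 with the flag set ("Indulás előtt.html"):
        // it is decoded as UTF-8 even though Cp437 is passed to the reader.
        byte[] flagged = makeZip("Indul\u00e1s el\u0151tt.html", StandardCharsets.UTF_8);
        System.out.println(listNames(flagged, cp437));
    }
}
```

Because correctly flagged entries are still read as UTF-8, passing Cp437 should only change how unflagged names are decoded. Note that Cp437 has no "ő", so an archive whose name really contains that letter must have been written with some other charset; if Cp437 still gives a garbled name, try the charset the packaging tool actually used.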
Alastair McCormack