When unzipping files in Windows, I'll occasionally have problems with paths

  1. that are too long for Windows (but okay in the original OS that created the file).
  2. that are "duplicate" due to case-insensitivity

Using DotNetZip, the ZipFile.Read(path) call throws an exception whenever it reads a zip file with one of these problems, which means I can't even try filtering the bad entries out.

using (ZipFile zip = ZipFile.Read(path))
{
    ...
}

What is the best way to handle reading those sorts of files?

Updated:

Example zip from here: https://github.com/MonoReports/MonoReports/zipball/master

Duplicates: https://github.com/MonoReports/MonoReports/tree/master/src/MonoReports.Model/DataSourceType.cs https://github.com/MonoReports/MonoReports/tree/master/src/MonoReports.Model/DatasourceType.cs

Here is more detail on the exception:

Ionic.Zip.ZipException: Cannot read that as a ZipFile
---> System.ArgumentException: An item with the same key has already been added.
at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
at Ionic.Zip.ZipFile.ReadCentralDirectory(ZipFile zf)
at Ionic.Zip.ZipFile.ReadIntoInstance(ZipFile zf)

Resolution:

Based on @Cheeso's suggestion, I can read everything from the stream, thus avoiding both the duplicate and the path issues:

//using (ZipFile zip = ZipFile.Read(path))
using (ZipInputStream stream = new ZipInputStream(path))
{
    ZipEntry e;
    while( (e = stream.GetNextEntry()) != null )
    //foreach( ZipEntry e in zip)
    {
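        // Case-insensitive extension check without allocating lowered copies.
        if (e.FileName.EndsWith(".cs", StringComparison.OrdinalIgnoreCase) ||
            e.FileName.EndsWith(".xaml", StringComparison.OrdinalIgnoreCase))
        {
            // Read the current entry straight from the zip stream.
            // The reader is deliberately not disposed: disposing it would
            // also close the underlying ZipInputStream.
            var sr = new StreamReader(stream);
            CodeFiles.Add(new CodeFile() { Content = sr.ReadToEnd(), FileName = e.FileName });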
        }
    }
}
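
Since the stream approach hands back every entry, including the ones that collide, it can help to flag the problem names while looping. Below is a minimal sketch of such a check. `EntryNameChecker` is a hypothetical helper, not part of DotNetZip, and the 260-character limit is an assumption based on the classic Windows MAX_PATH value; adjust it if long paths are enabled on your system.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DotNetZip.
public sealed class EntryNameChecker
{
    // Assumption: the classic Windows MAX_PATH limit of 260 characters.
    private const int MaxPath = 260;

    // Remembers every name seen so far, ignoring case.
    private readonly HashSet<string> _seen =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly string _targetRoot;

    public EntryNameChecker(string targetRoot)
    {
        _targetRoot = targetRoot;
    }

    // True if a name that differs at most by case was already seen.
    public bool IsCaseDuplicate(string entryName)
    {
        return !_seen.Add(entryName);
    }

    // True if root + separator + entry name reaches the assumed limit.
    // Plain length arithmetic; nothing here touches the file system.
    public bool IsTooLong(string entryName)
    {
        return _targetRoot.Length + 1 + entryName.Length >= MaxPath;
    }
}
```

Inside the `while` loop you could call `IsCaseDuplicate(e.FileName)` and `IsTooLong(e.FileName)` and skip or rename the entry as needed.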
gameweld

2 Answers

For the PathTooLongException problem, I found that you can't use DotNetZip. Instead, what I did was invoke the command-line version of 7-zip; that works wonders.

public static void Extract(string zipPath, string extractPath)
{
    try
    {
        ProcessStartInfo processStartInfo = new ProcessStartInfo
        {
            WindowStyle = ProcessWindowStyle.Hidden,
            FileName = Path.GetFullPath(@"7za.exe"),
            Arguments = "x \"" + zipPath + "\" -o\"" + extractPath + "\""
        };
        Process process = Process.Start(processStartInfo);
        process.WaitForExit();
        if (process.ExitCode != 0) 
        {
            Console.WriteLine("Error extracting {0}.", extractPath);
        }
    }
    catch (Exception e)
    {
        Console.WriteLine("Error extracting {0}: {1}", extractPath, e.Message);
        throw;
    }
}
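
If 7-Zip stops to ask about overwriting existing files, its `-y` switch (assume Yes on all queries) avoids the prompt. The sketch below only builds the argument string; `SevenZipArguments.Build` is a hypothetical helper name, and it assumes neither path ends in a backslash, since a trailing backslash would escape the closing quote.

```csharp
using System;

// Hypothetical helper: builds the argument string for "7za x".
public static class SevenZipArguments
{
    public static string Build(string zipPath, string extractPath, bool assumeYes)
    {
        // -o takes the output directory with no space after the switch.
        string args = "x \"" + zipPath + "\" -o\"" + extractPath + "\"";

        // -y answers Yes to all prompts, such as overwrite confirmations.
        if (assumeYes)
        {
            args += " -y";
        }
        return args;
    }
}
```

The result can be assigned to `ProcessStartInfo.Arguments` in place of the inline concatenation above.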
Maria Ines Parnisari
  • This is actually the easiest solution I found. At the time of this comment, it might be helpful to know that the 7za.exe is in the 7-zip downloads with "7-zip extra: ...." in the description. (http://www.7-zip.org/download.html) – Dave Welling Feb 08 '15 at 17:39
  • The `Arguments` line is incorrect; it should be `Arguments = "x \"" + zipPath + "\" -o\"" + extractPath + "\""` (note no space between the **o** switch and `extractPath`). – sigil Dec 26 '16 at 23:41
  • Thanks @sigil, according to [this](http://superuser.com/questions/95902/7-zip-and-unzipping-from-command-line) you are correct, I updated my answer. – Maria Ines Parnisari Dec 26 '16 at 23:47
  • Good answer. Also, to keep the process from stopping at a prompt about overwriting existing files, you can add `-y` to the arguments. – RVid Mar 16 '17 at 13:46
Read it with ZipInputStream.

The ZipFile class keeps a collection that uses the filename as the index. Duplicate filenames break that model.

But you can use the ZipInputStream to read in your ZipFile. There is no collection or index in that case.
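
To illustrate why a name-keyed index fails here, this is a minimal sketch using only a plain `Dictionary`. It assumes the index compares keys case-insensitively, which the stack trace in the question suggests but which is not confirmed here; `DuplicateKeyDemo` is a hypothetical name and not DotNetZip code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo, not DotNetZip code.
public static class DuplicateKeyDemo
{
    // Returns true if adding both names to a name-keyed dictionary throws.
    public static bool Collides(string first, string second,
                                IEqualityComparer<string> comparer)
    {
        var index = new Dictionary<string, int>(comparer);
        index.Add(first, 0);
        try
        {
            // Add throws ArgumentException when the key already exists.
            index.Add(second, 1);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

With `StringComparer.OrdinalIgnoreCase`, the two file names from the question collide; with `StringComparer.Ordinal` they do not. A forward-only stream has no such index, so it never hits this failure.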

Yousha Aleayoub
Cheeso