
I am trying to read only the first line of a zipped csv file. I used the code below but get the error "The magic number in GZIP header is not correct". Obviously this is because GZIP and ZIP are not identical formats, but I have not been able to get it working even with the DotNetZipLib or SharpZip libraries.

using (GZipStream gzipStream = new GZipStream(File.OpenRead(fileName), CompressionMode.Decompress))
{
    using (StreamReader sr = new StreamReader(gzipStream))
    {
        // Fails with the GZIP magic-number error when fileName is a .zip archive
        string row = sr.ReadLine();
    }
}

Does anyone know how to handle standard zip files (not gzip) and stream the content to a StreamReader object so that I can easily read the first line of the zipped text file? I am not looking for a solution that decompresses the whole zip file before opening the text file. I am looking for something similar to the code above, but one that can handle zip files. I also do not want to go the geeky route through byte arrays and reconstruct the first row from the array, as that would require knowing the exact content of the first row (data types, delimiters, ...).

Thanks

MethodMan
Matt
  • So the error you told us about is because GZip and Zip are not the same, which you knew. What happens when you use DotNetZipLib or SharpZip? – Chris Shain Feb 21 '12 at 17:26
  • Try googling "C# how to read compressed file using a StreamReader". There are TONS of examples out there, Matt – MethodMan Feb 21 '12 at 17:27
  • GZip & Zip are not the same thing. You will definitely need to use something like DotNetZipLib or SharpZip. Can you post the code you attempted with those and maybe we can advise you ? – Eoin Campbell Feb 21 '12 at 17:28
  • I tried, but I don't see how those libraries can stream to a StreamReader. Maybe I am missing something. I cannot read the first line of a zip file without having to completely decompress the thing first. – Matt Feb 21 '12 at 17:29
  • I will do a Google search for you and post a link. Hold fast. Here is a link to the DotNetZip library: http://dotnetzip.codeplex.com/ – MethodMan Feb 21 '12 at 17:30
  • @DJ KRAZE, I looked everywhere; they all either deal with gzip formats or decompress to byte arrays. As I stated, I am looking to stream to a StreamReader so that I can easily use ReadLine() to extract the first line of the zipped csv file. – Matt Feb 21 '12 at 17:31
  • Eoin, I don't even know where to start with those libraries. I see they can decompress to byte arrays, but I don't see functionality to stream to a StreamReader object. – Matt Feb 21 '12 at 17:33
  • Matt, don't panic. Download this library, it's free: http://dotnetzip.codeplex.com/ I am sure there is documentation as well. Otherwise you will have to uncompress and then read, which makes no sense if you can just get at the items themselves within the zip container. Does this make sense? – MethodMan Feb 21 '12 at 17:36
  • makes sense, marked your answer as solution, thanks a lot. – Matt Feb 21 '12 at 17:53

2 Answers


For example, Matt, here is something you could do. Check out this code sample, which uses the SharpZipLib library:

// Requires: using System; using System.IO; using ICSharpCode.SharpZipLib.Zip;
// Open the archive once and let ZipFile enumerate its own entries
using (var zipFile = new ZipFile(File.OpenRead(@"C:\MyZips\myzip.zip")))
{
    foreach (ZipEntry item in zipFile)
    {
        if (!item.IsFile) continue; // skip directory entries
        Console.WriteLine(item.Name);
        using (var reader = new StreamReader(zipFile.GetInputStream(item)))
        {
            // ReadLine pulls only as much of the entry as it needs for the first line
            Console.WriteLine(reader.ReadLine());
        }
    }
}
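
As a side note, on .NET 4.5 or later the same idea can be sketched without a third-party library, using the built-in System.IO.Compression classes. This is only a minimal sketch under assumptions: the class name ZipFirstLine and the helper ReadFirstLine are made up for illustration, and it simply takes the first file entry it finds in the archive.

```csharp
using System;
using System.IO;
using System.IO.Compression; // on .NET Framework, reference System.IO.Compression.FileSystem for ZipFile

public static class ZipFirstLine
{
    // Hypothetical helper: returns the first line of the first file entry
    // in the archive, or null if the archive contains no file entries.
    public static string ReadFirstLine(string zipPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them
                if (entry.Name.Length == 0) continue;

                // entry.Open() returns a stream that decompresses on demand,
                // so the entry is not extracted in full before reading
                using (Stream entryStream = entry.Open())
                using (StreamReader reader = new StreamReader(entryStream))
                {
                    return reader.ReadLine();
                }
            }
        }
        return null;
    }
}
```

Calling ZipFirstLine.ReadFirstLine(@"C:\MyZips\myzip.zip") would then hand back the header row of the zipped csv as a plain string, ready for Split or whatever parsing you need.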
MethodMan
  • DJ KRAZE, it works, thanks a lot. I wasted 3 hours getting this to work as I could not see reference to using StreamReader in particular in combination with the library. Awesome, you saved me more headache. – Matt Feb 21 '12 at 17:58
  • Awesome. I am glad I was able to quickly contribute to saving one's life, lol. If you have any other issues, feel free to reach out – MethodMan Feb 21 '12 at 18:31

The above answer didn't work for me (it threw a null reference error for "item" at runtime), so I modified the code a bit. A text file called "text.txt" is zipped in an archive called "archive.zip". This version is in VB.NET and uses the SharpZipLib library (you must reference it in your project and import it above Public Class MainForm).

Here is the code:

       Imports System.IO
       Imports ICSharpCode.SharpZipLib.Zip

' Now put the following code in a Private Sub (I put it in a button Click handler)

        Using zipfile As New ZipFile("c:\archive.zip") ' location of the zip file
            ' Look the entry up by name; GetEntry returns Nothing if it is not in the archive
            Dim item As ZipEntry = zipfile.GetEntry("text.txt")
            If item IsNot Nothing Then
                Console.WriteLine(item.Name)
                Using s As New StreamReader(zipfile.GetInputStream(item))
                    ' Show only the first line of the zipped text file
                    MsgBox(s.ReadLine())
                End Using
            End If
        End Using

When you run the code, a message box will pop up with the first line of the text file text.txt. Hope this helps. Cheers!

user2993965
  • I specified C# not VB.Net (despite the ease to port). Also, my chosen answer worked for me. – Matt Jan 20 '14 at 01:31