Maybe it's just me doing something wrong, or maybe this is the expected behavior of the class, but I feel like something is off...

I have made the following test archive (there are also files in the folders, but that shouldn't be relevant to this question): [image: listing of the test archive]

I use the following method to extract a single file (New Text Document.txt)...

#region SevenZipExtractor events
private void SevenZipExtractor_Extracting(object sender, ProgressEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZipExtractor_Extracting -- " + e.PercentDone + "%");

    m_progress.UpdateProcessingStatus(e.PercentDone);
}

private void SevenZipExtractor_FileExtractionFinished(object sender, FileInfoEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZipExtractor_FileExtractionFinished -- " + e.PercentDone + "% Filename:" + e.FileInfo.FileName);
}

private void SevenZipExtractor_FileExtractionStarted(object sender, FileInfoEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZipExtractor_FileExtractionStarted -- " + e.PercentDone + "% Filename:" + e.FileInfo.FileName);
}
#endregion

private void DecompressThread()
{
    using (SevenZipExtractor extractor = new SevenZipExtractor(inStream))
    {
        extractor.Extracting += SevenZipExtractor_Extracting;
        extractor.FileExtractionStarted += SevenZipExtractor_FileExtractionStarted;
        extractor.FileExtractionFinished += SevenZipExtractor_FileExtractionFinished;

        // Verbatim string (@) so the backslashes in the path are not treated as escape sequences.
        using (FileStream file = new FileStream(@"C:\Sandbox\Z-Test\New Text Document.txt", FileMode.Create, FileAccess.Write))
        {
            extractor.ExtractFile(4, file);
        }

        extractor.Extracting -= SevenZipExtractor_Extracting;
        extractor.FileExtractionStarted -= SevenZipExtractor_FileExtractionStarted;
        extractor.FileExtractionFinished -= SevenZipExtractor_FileExtractionFinished;
    }
}

Then with the events Extracting, FileExtractionStarted and FileExtractionFinished I would EXPECT to get back the following results...

SevenZipExtractor_FileExtractionStarted -- 100% Filename:New Text Document.txt
SevenZipExtractor_Extracting -- 100%
SevenZipExtractor_FileExtractionFinished -- 100% Filename:New Text Document.txt

However, I get back the following results...

SevenZipExtractor_Extracting -- 100%
SevenZipExtractor_Extracting -- 100%
SevenZipExtractor_FileExtractionStarted -- 20% Filename:Test Folder 1
SevenZipExtractor_FileExtractionFinished -- 20% Filename:Test Folder 1
SevenZipExtractor_FileExtractionStarted -- 40% Filename:Test Folder 2
SevenZipExtractor_FileExtractionFinished -- 40% Filename:Test Folder 2
SevenZipExtractor_FileExtractionStarted -- 60% Filename:Microsoft - Visual Studio 6 MSDN Library.iso
SevenZipExtractor_Extracting -- 1%
SevenZipExtractor_Extracting -- 2%
SevenZipExtractor_Extracting -- 3%
SevenZipExtractor_Extracting -- 4%
SevenZipExtractor_Extracting -- 5%
SevenZipExtractor_Extracting -- 6%
SevenZipExtractor_Extracting -- 7%
SevenZipExtractor_Extracting -- 8%
SevenZipExtractor_Extracting -- 9%
SevenZipExtractor_Extracting -- 10%
SevenZipExtractor_Extracting -- 11%
SevenZipExtractor_Extracting -- 12%
SevenZipExtractor_Extracting -- 13%
SevenZipExtractor_Extracting -- 14%
SevenZipExtractor_Extracting -- 15%
SevenZipExtractor_Extracting -- 16%
SevenZipExtractor_Extracting -- 17%
SevenZipExtractor_Extracting -- 18%
SevenZipExtractor_Extracting -- 19%
SevenZipExtractor_Extracting -- 20%
SevenZipExtractor_Extracting -- 21%
SevenZipExtractor_Extracting -- 22%
SevenZipExtractor_Extracting -- 23%
SevenZipExtractor_Extracting -- 24%
SevenZipExtractor_Extracting -- 25%
SevenZipExtractor_Extracting -- 26%
SevenZipExtractor_Extracting -- 27%
SevenZipExtractor_Extracting -- 28%
SevenZipExtractor_Extracting -- 29%
SevenZipExtractor_Extracting -- 30%
SevenZipExtractor_Extracting -- 31%
SevenZipExtractor_Extracting -- 32%
SevenZipExtractor_Extracting -- 33%
SevenZipExtractor_Extracting -- 34%
SevenZipExtractor_Extracting -- 35%
SevenZipExtractor_Extracting -- 36%
SevenZipExtractor_Extracting -- 37%
SevenZipExtractor_Extracting -- 38%
SevenZipExtractor_Extracting -- 39%
SevenZipExtractor_Extracting -- 40%
SevenZipExtractor_Extracting -- 41%
SevenZipExtractor_Extracting -- 42%
SevenZipExtractor_Extracting -- 43%
SevenZipExtractor_Extracting -- 44%
SevenZipExtractor_Extracting -- 45%
SevenZipExtractor_Extracting -- 46%
SevenZipExtractor_Extracting -- 47%
SevenZipExtractor_Extracting -- 48%
SevenZipExtractor_Extracting -- 49%
SevenZipExtractor_Extracting -- 50%
SevenZipExtractor_Extracting -- 51%
SevenZipExtractor_Extracting -- 52%
SevenZipExtractor_Extracting -- 53%
SevenZipExtractor_Extracting -- 54%
SevenZipExtractor_Extracting -- 55%
SevenZipExtractor_Extracting -- 56%
SevenZipExtractor_Extracting -- 57%
SevenZipExtractor_Extracting -- 58%
SevenZipExtractor_Extracting -- 59%
SevenZipExtractor_Extracting -- 60%
SevenZipExtractor_Extracting -- 61%
SevenZipExtractor_Extracting -- 62%
SevenZipExtractor_Extracting -- 63%
SevenZipExtractor_Extracting -- 64%
SevenZipExtractor_Extracting -- 65%
SevenZipExtractor_Extracting -- 66%
SevenZipExtractor_Extracting -- 67%
SevenZipExtractor_Extracting -- 68%
SevenZipExtractor_Extracting -- 69%
SevenZipExtractor_Extracting -- 70%
SevenZipExtractor_Extracting -- 71%
SevenZipExtractor_Extracting -- 72%
SevenZipExtractor_Extracting -- 73%
SevenZipExtractor_Extracting -- 74%
SevenZipExtractor_Extracting -- 75%
SevenZipExtractor_Extracting -- 76%
SevenZipExtractor_Extracting -- 77%
SevenZipExtractor_Extracting -- 78%
SevenZipExtractor_Extracting -- 79%
SevenZipExtractor_Extracting -- 80%
SevenZipExtractor_Extracting -- 81%
SevenZipExtractor_Extracting -- 82%
SevenZipExtractor_Extracting -- 83%
SevenZipExtractor_Extracting -- 84%
SevenZipExtractor_Extracting -- 85%
SevenZipExtractor_Extracting -- 86%
SevenZipExtractor_Extracting -- 87%
SevenZipExtractor_Extracting -- 88%
SevenZipExtractor_Extracting -- 89%
SevenZipExtractor_Extracting -- 90%
SevenZipExtractor_Extracting -- 91%
SevenZipExtractor_Extracting -- 92%
SevenZipExtractor_Extracting -- 93%
SevenZipExtractor_Extracting -- 94%
SevenZipExtractor_Extracting -- 95%
SevenZipExtractor_Extracting -- 96%
SevenZipExtractor_Extracting -- 97%
SevenZipExtractor_Extracting -- 98%
SevenZipExtractor_Extracting -- 99%
SevenZipExtractor_FileExtractionFinished -- 60% Filename:Microsoft - Visual Studio 6 MSDN Library.iso
SevenZipExtractor_FileExtractionStarted -- 80% Filename:New Microsoft Excel Worksheet.xlsx
SevenZipExtractor_FileExtractionFinished -- 80% Filename:New Microsoft Excel Worksheet.xlsx
SevenZipExtractor_FileExtractionStarted -- 100% Filename:New Text Document.txt
SevenZipExtractor_Extracting -- 100%
SevenZipExtractor_FileExtractionFinished -- 100% Filename:New Text Document.txt

It seems that even though I'm trying to extract a single file, it processes every entry up to that point. When I use this on a larger scale (extracting a whole archive rather than just a single file), with an archive that has one big file in the root and a bunch of small files in folders, I see a huge impact: extracting each small file takes just as long as extracting the large file at the root of the archive.

Is there some type of expectation for the user to set a seek point in a memory stream or something? How can I make it not take so long to extract a small text file?

Arvo Bowen
  • @user2864740 sorry, meant no disrespect. I was just wondering if there was a bug or if it was intended for a reason I didn't know. Question title has been updated. – Arvo Bowen Feb 14 '20 at 20:40
  • @ArvoBowen Suggested reading, which I think explains the first comment: https://blog.codinghorror.com/the-first-rule-of-programming-its-always-your-fault/ maybe? –  Feb 14 '20 at 20:51
  • @Amy honestly, I didn't see my title as being a negative comment. It was just a question: is there a bug when trying to extract one file, or is it supposed to be that way? If the answer is the latter, then it was my bad implementation and I was seeking help to use the tool the right way. I was never taking the stance "it's broken and they need to fix it" (aka "select is broken"). I was just looking for guidance, that's all. I guess it just came off wrong with the title I used. But no worries, I changed it as requested. – Arvo Bowen Feb 14 '20 at 21:05
  • @ArvoBowen I understand, I was just trying to add some visibility into why that comment might have been written. SO does get a lot of questions like "this is my code, is there a bug in X?" It's a common reaction. I wouldn't take it personally. Like you said, no worries –  Feb 14 '20 at 21:10
  • @ArvoBowen Could you share your implementations of the event handlers? I would like to try this on my machine. –  Feb 14 '20 at 21:15
  • @Amy, code has been updated with them included. Thanks. – Arvo Bowen Feb 14 '20 at 21:23

2 Answers

I have submitted an issue to the GitHub repo. At this point I believe it's either a bug that was never addressed, or stepping through an archive one file at a time was never the intended way to extract all of it. In the example in my question, I was trying to extract a single file from an archive without the class processing the entire archive (even though it only extracts that single file in the end).

In the grand scheme, I was trying to extract an entire archive manually, one file at a time, mainly because the messages I was getting from the referenced events were not giving me dependable results (such as reporting ONLY the one file being processed). That might not be an intended way to extract an archive; it may just be another route to the same result that is rarely used and so never complained about.

I ended up extracting the entire archive with the following method, with success, and got very dependable messages back from the events described above. I also added a little more to this example (than what was in my question) to make it completely working code; I had mistakenly left out the inStream reference in the question.

#region SevenZipExtractor events
private void SevenZip_Processing(object sender, ProgressEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZip_Processing -- " + e.PercentDone + "%");

    m_progress.UpdateProcessingStatus(e.PercentDone);
}

private void SevenZipExtractor_FileExtractionFinished(object sender, FileInfoEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZipExtractor_FileExtractionFinished -- " + e.PercentDone + "% Filename:" + e.FileInfo.FileName);
}

private void SevenZipExtractor_FileExtractionStarted(object sender, FileInfoEventArgs e)
{
    System.Diagnostics.Debug.WriteLine("SevenZipExtractor_FileExtractionStarted -- " + e.PercentDone + "% Filename:" + e.FileInfo.FileName);
}
#endregion

private void DecompressThread(string archiveFilePath)
{
    byte[] fileInBytes = File.ReadAllBytes(archiveFilePath);

    using (MemoryStream inStream = new MemoryStream(fileInBytes))
    {
        using (SevenZipExtractor extractor = new SevenZipExtractor(inStream))
        {
            extractor.Extracting += SevenZip_Processing;
            extractor.FileExtractionStarted += SevenZipExtractor_FileExtractionStarted;
            extractor.FileExtractionFinished += SevenZipExtractor_FileExtractionFinished;

            // Verbatim string (@) so the backslashes in the path are not treated as escape sequences.
            extractor.ExtractArchive(@"C:\Sandbox\Z-Test");

            extractor.Extracting -= SevenZip_Processing;
            extractor.FileExtractionStarted -= SevenZipExtractor_FileExtractionStarted;
            extractor.FileExtractionFinished -= SevenZipExtractor_FileExtractionFinished;
        }
    }
}

And the results were...

SevenZipExtractor_FileExtractionStarted -- 11% Filename:Test Folder 1
SevenZipExtractor_FileExtractionFinished -- 11% Filename:Test Folder 1
SevenZipExtractor_FileExtractionStarted -- 22% Filename:Test Folder 2
SevenZipExtractor_FileExtractionFinished -- 22% Filename:Test Folder 2
SevenZipExtractor_FileExtractionStarted -- 33% Filename:Microsoft - Visual Studio 6 MSDN Library.iso
SevenZip_Processing -- 20%
SevenZip_Processing -- 40%
SevenZip_Processing -- 60%
SevenZip_Processing -- 80%
SevenZipExtractor_FileExtractionFinished -- 33% Filename:Microsoft - Visual Studio 6 MSDN Library.iso
SevenZipExtractor_FileExtractionStarted -- 44% Filename:New Microsoft Excel Worksheet.xlsx
SevenZipExtractor_FileExtractionFinished -- 44% Filename:New Microsoft Excel Worksheet.xlsx
SevenZipExtractor_FileExtractionStarted -- 56% Filename:New Text Document.txt
SevenZipExtractor_FileExtractionFinished -- 56% Filename:New Text Document.txt
SevenZipExtractor_FileExtractionStarted -- 67% Filename:Test Folder 1\New Text Document In TF1 - Copy.txt
SevenZipExtractor_FileExtractionFinished -- 67% Filename:Test Folder 1\New Text Document In TF1 - Copy.txt
SevenZipExtractor_FileExtractionStarted -- 78% Filename:Test Folder 1\New Text Document In TF1.txt
SevenZipExtractor_FileExtractionFinished -- 78% Filename:Test Folder 1\New Text Document In TF1.txt
SevenZipExtractor_FileExtractionStarted -- 89% Filename:Test Folder 2\New Text Document In TF2 - Copy.txt
SevenZipExtractor_FileExtractionFinished -- 89% Filename:Test Folder 2\New Text Document In TF2 - Copy.txt
SevenZipExtractor_FileExtractionStarted -- 100% Filename:Test Folder 2\New Text Document In TF2.txt
SevenZip_Processing -- 100%
SevenZipExtractor_FileExtractionFinished -- 100% Filename:Test Folder 2\New Text Document In TF2.txt
Arvo Bowen

I'm the author of another SevenZipSharp fork, and finally got around to looking at your issue. The reason you're getting so many file extraction events when trying to extract just one file is that 7z archives are typically created using solid compression.

When you try to extract a single file from a solid archive, the decompression starts at the beginning of the file and goes through it until it finds the file you were looking for. Your archive is the worst-case scenario, where the file you're looking for is at the end of the archive (your results show "New Text Document.txt" being extracted last).
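
To make that concrete, here is a minimal sketch of the idea. It does not use SevenZipSharp or the real 7z format. It assumes a toy "solid" container in which all entries are concatenated and compressed as one Deflate stream, and the entry contents and sizes are invented for illustration. The point is only that reaching the last entry means decoding everything stored in front of it.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class SolidStreamSketch
{
    // Reads exactly `count` bytes from `source`, or throws if the stream ends early.
    static byte[] ReadFully(Stream source, int count)
    {
        var result = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int n = source.Read(result, offset, count - offset);
            if (n == 0) throw new EndOfStreamException();
            offset += n;
        }
        return result;
    }

    static void Main()
    {
        // Hypothetical entries: one large file followed by two small ones.
        byte[][] entries =
        {
            new byte[1000000],                 // stands in for the large .iso
            Encoding.UTF8.GetBytes("sheet"),   // stands in for the .xlsx
            Encoding.UTF8.GetBytes("hello"),   // stands in for the .txt we want
        };

        // Toy "solid" packing: every entry goes into ONE compressed stream.
        var packed = new MemoryStream();
        using (var deflate = new DeflateStream(packed, CompressionMode.Compress, true))
        {
            foreach (byte[] entry in entries)
                deflate.Write(entry, 0, entry.Length);
        }
        packed.Position = 0;

        // A compressed stream cannot be entered in the middle, so to reach the
        // last entry we have to decode (and throw away) all entries before it.
        long discarded = 0;
        using (var inflate = new DeflateStream(packed, CompressionMode.Decompress))
        {
            for (int i = 0; i < entries.Length - 1; i++)
                discarded += ReadFully(inflate, entries[i].Length).Length;

            byte[] target = ReadFully(inflate, entries[entries.Length - 1].Length);
            Console.WriteLine("Bytes decoded and discarded first: " + discarded);
            Console.WriteLine("Target contents: " + Encoding.UTF8.GetString(target));
        }
    }
}
```

If each entry were compressed as its own stream instead (roughly what zip does), the reader could jump straight to the small file without touching the large one.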

The solution to your problem depends on what you need to achieve in the end, and what power you have over the creation of the archive. If you need to extract just the one file, and can change the archive format, I'd aim for a format that does not use solid compression (e.g. zip). If you're forced to use the 7z file, and need to use several of the files, the best course would probably be to extract them all to a temporary directory and work on them there.
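
A rough sketch of the temporary-directory route, assuming the SevenZipSharp types already used in the question (`SevenZipExtractor`, `ArchiveFileData`, `ExtractFiles`) and that the 7z library path is already configured as in your working code. The helper name `ExtractToTemp` and its parameters are made up for illustration. The idea is to request all the entries you need in a single call, so a solid block should only have to be decoded once rather than once per file.

```csharp
using System.Collections.Generic;
using System.IO;
using SevenZip;

static class SolidAwareExtraction
{
    // Hypothetical helper: extracts the named entries into a fresh scratch
    // directory with one ExtractFiles call and returns that directory's path.
    public static string ExtractToTemp(string archivePath, ISet<string> wantedNames)
    {
        string tempDir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(tempDir);

        using (var extractor = new SevenZipExtractor(archivePath))
        {
            // Collect the archive indexes of the entries we actually want.
            var indexes = new List<int>();
            foreach (ArchiveFileInfo entry in extractor.ArchiveFileData)
            {
                if (!entry.IsDirectory && wantedNames.Contains(entry.FileName))
                    indexes.Add(entry.Index);
            }

            // Ask for everything at once instead of calling ExtractFile per entry.
            if (indexes.Count > 0)
                extractor.ExtractFiles(tempDir, indexes.ToArray());
        }

        return tempDir;
    }
}
```

The caller is responsible for deleting the returned directory when done. If you need every entry anyway, `ExtractArchive` on the same directory is the simpler choice.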

Squid-Box