So I want to decompress some data stored in a file. The solution I am currently using looks roughly like this:

FileStream fileStream = new FileStream([File path or whatever]);
BinaryReader brStream = new BinaryReader(fileStream);

// Seek to the compressed region and read all of it into memory
brStream.BaseStream.Position = [Address of the data I want];
byte[] relevantBytes = brStream.ReadBytes([Length of the data I want]);

// Wrap the in-memory copy so GZipStream can decompress it
MemoryStream memoryStream = new MemoryStream(relevantBytes);
GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress);

But the problem arises when the data I'm compressing/decompressing is large. I don't want all of that in memory. I would prefer to give GZipStream a FileStream with a starting index and length. How can I do that?

S41L0R
  • _"I would prefer to give GZipStream a FileStream with a starting index and length"_ -- the first part is trivial. Did you try it? As for decompressing only part of a stream, that's not possible; the gzip algorithm (and really, practically every compression algorithm) works only over the entire block of data. There's no way to pick an arbitrary byte offset and start decompressing from there. – Peter Duniho Jul 29 '21 at 22:03
  • If you are stuck with gzip, you can't do what you want. But if you are able to use a different format for compression, see duplicate for alternative approaches to creating the compressed data which could fit your usage patterns. – Peter Duniho Jul 29 '21 at 22:08
  • @PeterDuniho Decompressing only part of a stream is the same exact thing as giving GZipStream a FileStream with a starting index and length but with different wording? – S41L0R Jul 29 '21 at 22:15
  • @PeterDuniho To your second message, I don't get how using a different compression algorithm would help. My problem is with the GZipStream API, not the algorithm it runs. – S41L0R Jul 29 '21 at 22:16
  • Maybe what I need is some sort of FileStream class that's a portion of a file, I don't know. Just some way in general to keep the data from taking up too much memory at a time. – S41L0R Jul 29 '21 at 22:22
  • _"I don't get how using a different compression algorithm would help"_ -- a different _algorithm_ doesn't help (unless you come across one that is block oriented and so inherently allows you to treat individual subsets of the data independently). The point is that if you have the ability to _change the problem itself_, a _different_ problem can be solved. _"My problem is with the GZipStream API"_ -- the API offers no way to decompress a subset of an input stream, because the algorithm itself inherently precludes that possibility. – Peter Duniho Jul 29 '21 at 22:24
  • _"Just some way in general to keep the data from taking up too much memory at a time"_ -- if you are using a `FileStream`, then the only data in memory at one time is whatever _your_ code chooses to keep in memory at one time. With gzip, you have no choice but to decompress starting at the beginning. But there's no rule that says you must keep _all_ of the decompressed data in memory once read from `GZipStream`. – Peter Duniho Jul 29 '21 at 22:26
  • You could implement a class derived from Stream that emulates part of the stream as a virtual stream. See: [VirtualFileStream class](https://pastebin.com/JhsmWuds) `FileStream fileStream = new FileStream([File path or whatever]); VirtualFileStream virtualFileStream = new VirtualFileStream(fileStream, [Address of the data I want], [Length of the data I want]); GZipStream gZipStream = new GZipStream(virtualFileStream, CompressionMode.Decompress)` – Cosmin Rus Jul 29 '21 at 22:36
  • @PeterDuniho Ah, I think I see where the confusion is coming from. I've got a large file, and only *part* of it is compressed. I just want to decompress the part that's compressed. – S41L0R Jul 30 '21 at 00:32
  • Ah, okay. In that case, it's possible you can just use a `FileStream` and set the `Position` before giving it to `GZipStream`. Whether that works depends on how `GZipStream` decides it's done decoding, but there's a chance it doesn't care about the actual stream length. If it does, see the new duplicate target I provided above, which shows how to create a `Stream` wrapper that will expose just the subset you want. – Peter Duniho Jul 30 '21 at 00:38
  • @CosminRus Same to you, thanks – S41L0R Jul 30 '21 at 00:39
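
The wrapper approach suggested in the comments could look roughly like the sketch below. This is only an illustration, not the linked `VirtualFileStream` or the code from the duplicate target: `SubStream`, `PartialGzip`, and `DecompressRange` are hypothetical names made up for this example, and it assumes the region `[start, start + length)` holds one complete gzip stream. The wrapper exposes just that slice of the file as a read-only `Stream`, so `GZipStream` pulls compressed bytes from disk as it needs them instead of from a pre-filled `byte[]`.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (not part of .NET): a read-only view over
// the byte range [start, start + length) of a seekable stream.
public sealed class SubStream : Stream
{
    private readonly Stream _inner;
    private readonly long _start;
    private readonly long _length;
    private long _position; // relative to _start

    public SubStream(Stream inner, long start, long length)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (!inner.CanRead || !inner.CanSeek)
            throw new ArgumentException("Stream must be readable and seekable.", nameof(inner));
        if (start < 0 || length < 0 || start + length > inner.Length)
            throw new ArgumentOutOfRangeException(nameof(length));
        _inner = inner;
        _start = start;
        _length = length;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0 || value > _length)
                throw new ArgumentOutOfRangeException(nameof(value));
            _position = value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        long remaining = _length - _position;
        if (remaining <= 0) return 0;                 // end of the slice
        if (count > remaining) count = (int)remaining; // never read past the slice
        _inner.Position = _start + _position;
        int read = _inner.Read(buffer, offset, count);
        _position += read;
        return read;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin == SeekOrigin.Begin ? offset
                    : origin == SeekOrigin.Current ? _position + offset
                    : _length + offset;
        Position = target;
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class PartialGzip
{
    // Decompresses the gzip data stored at [start, start + length) of the
    // file and streams the result into 'output' without buffering it all.
    public static void DecompressRange(string inputPath, long start, long length, Stream output)
    {
        using (var file = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        using (var slice = new SubStream(file, start, length))
        using (var gzip = new GZipStream(slice, CompressionMode.Decompress))
        {
            gzip.CopyTo(output); // copies in chunks through an internal buffer
        }
    }
}
```

If `output` is another `FileStream` (or anything else that consumes data as it arrives), neither the compressed nor the decompressed bytes have to be held in memory in full. Passing a `MemoryStream` as `output` would of course bring the whole decompressed result back into memory.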

0 Answers