
I need to create a binary data file. It cannot be created in one pass: I need to serialize some data, then go back and write offsets in the header. The file will comfortably fit in memory (a few megabytes). Can I use BinaryWriter and go back to write the offsets using writer.Seek(x, SeekOrigin.Begin)? Or does writing to a file (and then modifying it) have any advantages? Or is there no real difference?

abatishchev
tomash
    How much seeking will you need to do? I suppose it would be faster if the underlying stream you use for the writers is a MemoryStream, rather than a FileStream. That way you reduce disk access, and can write the finished stream to disk in a single pass. – havardhu Aug 29 '11 at 22:36

2 Answers

1

Rather than offsets into the file, you should create a packed structure to represent the header. Fill in the structure and write it at the beginning of the file. It will also be easier to read the structure in one shot.

Steve Wellens
  • I don't know values to write in the header before rest of the data is serialized (written to file/stream writer). Question still stands, if we change "write offsets" to "write filled header struct". – tomash Aug 29 '11 at 23:02
  • Write an empty structure to the file. Then write all the 'stuff' recording the offsets into the structure. Then rewrite the structure...(at offset zero of course). – Steve Wellens Aug 30 '11 at 00:53
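
The "write a placeholder, then rewrite it" approach from the comments could look roughly like the sketch below. It is only an illustration under assumptions that are not in the question: the class name `HeaderPatchSketch`, the method `Build`, and the header layout (an int32 chunk count followed by one int64 offset per chunk) are all hypothetical, and the chunk count is assumed to be known before writing starts so that the header has a fixed size. It writes into a MemoryStream, as suggested in the comment on the question, so the finished bytes can be saved to disk in one pass.

```csharp
using System;
using System.IO;
using System.Text;

static class HeaderPatchSketch
{
    // Hypothetical layout: [int32 count][int64 offset per chunk][chunk bytes...]
    public static byte[] Build(string[] chunks)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms, Encoding.UTF8))
        {
            // 1. Reserve the header by writing placeholder values.
            writer.Write(chunks.Length);
            long offsetTablePos = ms.Position;
            for (int i = 0; i < chunks.Length; i++)
                writer.Write(0L);

            // 2. Write the payload, recording where each chunk starts.
            var offsets = new long[chunks.Length];
            for (int i = 0; i < chunks.Length; i++)
            {
                offsets[i] = ms.Position;
                writer.Write(Encoding.UTF8.GetBytes(chunks[i]));
            }

            // 3. Seek back and overwrite the placeholders with the real offsets.
            writer.Seek((int)offsetTablePos, SeekOrigin.Begin);
            foreach (long offset in offsets)
                writer.Write(offset);

            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

The result can then be written out with something like `File.WriteAllBytes(path, HeaderPatchSketch.Build(chunks))`. Passing a FileStream to the BinaryWriter instead should work the same way, since seeking back simply overwrites the bytes already there.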
0

I think I understand your problem. You're serializing objects from other parts of your program, and you don't know how large each chunk is until you ask to serialize it, and you don't want to serialize all of them at once because that might take a lot of memory.

So you want to leave some room at the front of your file, write your binary data in chunks while recording how big each chunk was, then go back and write a header to indicate where each chunk starts and stops.

The solution you're asking about seems reasonable, but I believe BinaryWriter will overwrite existing bytes if you seek back to your header location, so you'll need to write a pad of empty bytes up front to leave yourself room for the header - http://msdn.microsoft.com/en-us/library/system.io.binarywriter.seek.aspx#Y640

Now the problem: how big is your header going to be? How many empty bytes do you write? That will probably depend on the number of objects you have to serialize, so it sounds like you now have the same problem as before.

Instead, I would pack your data sequentially, as an example:

| --------- Chunk 1 ----------|--------- Chunk 2 -----------|
| length | name | ... | bytes | length | name | ... | bytes |

The length field encodes the total length of that chunk. Next come whatever properties you store per chunk, then the raw bytes of the actual object being serialized. A reader can use the length to jump from one chunk to the next, so no offset table is needed up front.
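
A minimal sketch of this sequential layout is below. The names `ChunkStreamSketch`, `WriteChunk` and `ReadChunkNames` are made up for illustration, and the exact field layout is an assumption: here the length is an int32 that counts everything after the length field itself (the name plus the payload), and the name is the only per-chunk property.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ChunkStreamSketch
{
    // Hypothetical chunk layout: [int32 length][string name][payload bytes],
    // where length counts the bytes that follow the length field.
    public static void WriteChunk(BinaryWriter writer, string name, byte[] payload)
    {
        // Build the chunk body in memory first so its length is known.
        using (var body = new MemoryStream())
        using (var bodyWriter = new BinaryWriter(body, Encoding.UTF8))
        {
            bodyWriter.Write(name);     // BinaryWriter length-prefixes strings
            bodyWriter.Write(payload);
            bodyWriter.Flush();

            byte[] bytes = body.ToArray();
            writer.Write(bytes.Length);
            writer.Write(bytes);
        }
    }

    // Walks the stream chunk by chunk, reading only the names.
    public static List<string> ReadChunkNames(BinaryReader reader)
    {
        var names = new List<string>();
        Stream stream = reader.BaseStream;
        while (stream.Position < stream.Length)
        {
            int length = reader.ReadInt32();
            long next = stream.Position + length;
            names.Add(reader.ReadString());
            stream.Position = next;     // skip the payload without reading it
        }
        return names;
    }
}
```

Only one chunk body is held in memory at a time, and nothing has to be patched after the fact. The trade-off is that finding a particular chunk means walking the chunks in order rather than looking up an offset in a header.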

antiduh