I have a file declared like this:

file of record
  Str: string[250];
  RecType: Cardinal;
end;

After this file had been in use for some time, my customer found that Str is never longer than 100 characters, and he also needs additional fields.

In the new version the file is declared like this:

file of packed record
  Str: string[200];
  Reserved: array[1..47] of Byte;
  NewField: Cardinal;
  RecType: Cardinal;
end;

The new record has the same size: in the old record there was one unused padding byte between Str and RecType, because the old record is aligned rather than packed.
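The size claim can be checked by letting the compiler report the sizes and field offsets. This is only a sketch, not the project's code: the names TOldRec, TNewRec and RecordSizes are made up for illustration, and it assumes the old record is compiled with the compiler's default field alignment, as the padding byte described above implies.

```pascal
program RecordSizes;
{$APPTYPE CONSOLE}

type
  // Old layout from the question (hypothetical type name), default alignment.
  TOldRec = record
    Str: string[250];
    RecType: Cardinal;
  end;

  // New layout from the question (hypothetical type name), packed.
  TNewRec = packed record
    Str: string[200];
    Reserved: array[1..47] of Byte;
    NewField: Cardinal;
    RecType: Cardinal;
  end;

var
  OldRec: TOldRec;
  NewRec: TNewRec;
begin
  // Total size of each record as the compiler lays it out.
  WriteLn('SizeOf(TOldRec)    = ', SizeOf(TOldRec));
  WriteLn('SizeOf(TNewRec)    = ', SizeOf(TNewRec));

  // Byte offset of RecType inside each record.
  WriteLn('Old RecType offset = ',
    PAnsiChar(@OldRec.RecType) - PAnsiChar(@OldRec));
  WriteLn('New RecType offset = ',
    PAnsiChar(@NewRec.RecType) - PAnsiChar(@NewRec));
  WriteLn('New NewField offset = ',
    PAnsiChar(@NewRec.NewField) - PAnsiChar(@NewRec));
end.
```

If the two sizes and the two RecType offsets match, the old code finds RecType where the new code wrote it. Any difference means the layouts are not interchangeable under the compiler settings in use.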

Question: what happens when the new file is read by the old code? He needs backward compatibility.

Old code reading sample:

var
  FS: TFileStream;
  Rec: record
         Str: string[250];
         RecType: Cardinal;
       end;
...
// reading record by record from file:
FS.Read(Rec, SizeOf(Rec));
Alex Egorov
  • You need to provide language tags with your questions. People who are not familiar with Pascal (or Delphi) have no idea what you're asking here, and people who are might miss your question without the tags. – Ken White Feb 15 '13 at 00:22
  • Can you post some of the code that reads it? – placeybordeaux Feb 15 '13 at 00:25
  • It seems to me it would be very easy to write a quick test application that writes some records in the new format and then tries to read them using the old format; it would answer the question almost immediately, and would give you a test you could use for future changes as well. – Ken White Feb 15 '13 at 00:37
  • I have a test and it works, but I'm confused: is it possible that the Str field could be garbled? – Alex Egorov Feb 15 '13 at 00:44
  • Which version of Delphi are you using? – jachguate Feb 15 '13 at 00:51
  • I'm using Delphi 5; this is a very old project. – Alex Egorov Feb 15 '13 at 00:54
  • BTW, your records are not the same size; `SizeOf` says the old one is 256 bytes, and the new one is 252, according to Delphi 2007. – Ken White Feb 15 '13 at 01:00
  • That was my mistake: I typed the code by hand. It should be [1..47], of course, sorry. The question is not about the sizes of the records. – Alex Egorov Feb 15 '13 at 01:09
  • Sorry, but the question *is* about the size of the records, if you post totally different record sizes. :-) Then the answer becomes "No, obviously this won't work, because you're reading and writing totally different sized records.". Thanks for the quick edit, though. :-) – Ken White Feb 15 '13 at 01:16
  • Much more sensible would be to stop blitting binary records onto files with legacy IO. Then the question goes away. – David Heffernan Feb 15 '13 at 07:17
  • What format do you recommend with comparable read speed? – Alex Egorov Feb 15 '13 at 10:26
  • A good SAX based XML/YAML/JSON parser should be at least comparable and likely faster. Because you don't need to store all the unused content. Your files are probably 5 times the size that they need to be. And then you are also in compat heaven and not tied to some inflexible fixed record binary format. – David Heffernan Feb 15 '13 at 11:33

1 Answer

The old-school Pascal string uses the first byte of the string (index 0) to store the length of the string.

Let's look at the memory of this record:

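byte    0  1  2  3  4  5  6  7  8  9 10 11  12  13 ........ 248..251 252..255
value  10 65 66 67 68 69 70 71 72 73 74  0 200 130          NewField RecType

Bytes 11 to 247 can contain garbage; the program simply ignores it (it is never shown), because it takes the value 10 at byte 0 as the length of the string, so the string is 'ABCDEFGHIJ'. The old code's Str field spans bytes 0 to 250, so it also covers the first three bytes of NewField, but since a new record never stores a length above 200, those bytes are never treated as text. The old code then finds RecType at byte 252, after its one padding byte, exactly where the new packed record stores it.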

This ensures that the old program, reading a file created with the most recent version, will never see garbage at the end of the strings: its view of each string is limited to the string's actual length, and the memory positions beyond that are simply ignored.
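A minimal sketch of that read path, assuming the two record layouts from the question and the compiler's default alignment for the old record. The type names, the program name and the sample values for NewField and RecType are hypothetical, and Move stands in for the old code's FS.Read(Rec, SizeOf(Rec)), since both just copy raw bytes into the old record.

```pascal
program ReadNewAsOld;
{$APPTYPE CONSOLE}

type
  // Old layout from the question (hypothetical type name).
  TOldRec = record
    Str: string[250];
    RecType: Cardinal;
  end;

  // New layout from the question (hypothetical type name).
  TNewRec = packed record
    Str: string[200];
    Reserved: array[1..47] of Byte;
    NewField: Cardinal;
    RecType: Cardinal;
  end;

var
  NewRec: TNewRec;
  OldRec: TOldRec;
begin
  // Non-zero fill so Reserved and any untouched bytes hold "garbage".
  FillChar(NewRec, SizeOf(NewRec), $AA);
  NewRec.Str := 'ABCDEFGHIJ';
  NewRec.NewField := $12345678;  // hypothetical sample value
  NewRec.RecType := 7;           // hypothetical sample value

  // The byte-for-byte reinterpretation only makes sense for equal sizes.
  if SizeOf(OldRec) <> SizeOf(NewRec) then
  begin
    WriteLn('Record sizes differ; the layouts are not interchangeable.');
    Halt(1);
  end;

  // Raw copy of a new-format record into the old record variable.
  Move(NewRec, OldRec, SizeOf(OldRec));

  // The length byte limits what the old code sees of Str.
  WriteLn('Length(Str) = ', Length(OldRec.Str));
  WriteLn('Str         = ', OldRec.Str);
  WriteLn('RecType     = ', OldRec.RecType);
end.
```

If the layouts line up as described, the old record should show the ten-character string and the RecType value that the new record stored, with no trace of the fill pattern in the text.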

You have to double-check that the old program does not change the stored values in case it writes the records back to the file. I think that is also safe, but I'm not sure and have no Delphi at hand to test.
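One way to check the write-back case is a round trip in memory. This is a sketch under the same assumptions as before (layouts from the question, default alignment for the old record); the type names, program name and sample values are hypothetical, and the two Move calls stand in for the old code reading a record from the file and writing it back. It deliberately reports the result rather than assuming one.

```pascal
program OldCodeRoundTrip;
{$APPTYPE CONSOLE}

type
  // Old layout from the question (hypothetical type name).
  TOldRec = record
    Str: string[250];
    RecType: Cardinal;
  end;

  // New layout from the question (hypothetical type name).
  TNewRec = packed record
    Str: string[200];
    Reserved: array[1..47] of Byte;
    NewField: Cardinal;
    RecType: Cardinal;
  end;

var
  Before, After: TNewRec;
  OldRec: TOldRec;
begin
  if SizeOf(TOldRec) <> SizeOf(TNewRec) then
  begin
    WriteLn('Record sizes differ; the layouts are not interchangeable.');
    Halt(1);
  end;

  // A record as the new code would write it.
  FillChar(Before, SizeOf(Before), 0);
  Before.Str := 'ABCDEFGHIJ';
  Before.NewField := $12345678;  // hypothetical sample value
  Before.RecType := 7;           // hypothetical sample value

  Move(Before, OldRec, SizeOf(OldRec));  // old code reads the record
  OldRec.Str := 'edited by old code';    // old code changes the text
  Move(OldRec, After, SizeOf(After));    // old code writes it back

  // Report whether the fields the old code does not know about survived.
  if After.NewField = Before.NewField then
    WriteLn('NewField preserved')
  else
    WriteLn('NewField changed');

  if After.RecType = Before.RecType then
    WriteLn('RecType preserved')
  else
    WriteLn('RecType changed');
end.
```

Repeating the test with a string longer than 200 characters assigned to OldRec.Str would show what happens if the old code ever writes more text than the new layout allows.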

jachguate
  • I think the same about the string format, and with the old program the customer will only view data, without editing it. – Alex Egorov Feb 15 '13 at 01:16
  • Nicely explained, but the question was about *old code* reading the *new records*. Since the old record defined the string as 50 bytes larger, it could indeed contain a string that somehow was missed that is longer than the 200 characters in the new record (if, for instance, there was an actual 250 byte string stored, or one that was padded with spaces for some reason). – Ken White Feb 15 '13 at 01:20
  • None of the records contain strings longer than 100 chars. – Alex Egorov Feb 15 '13 at 01:30
  • @Ken, a string in a _new record_ cannot be longer than 200, so there's no way the old code can mess with that. In any case, the problem is for the new code reading _old records_, because a string there could theoretically be longer than 200. – jachguate Feb 15 '13 at 16:45
  • You're thinking backward. If the *old code* reads beyond the 200 characters (because it thinks it's allowed 250), it could either read the wrong value written by the new code in the extra bytes or could write something there that it shouldn't in space the new code is now using, corrupting the new code. – Ken White Feb 15 '13 at 16:47
  • @Ken, my answer takes that into account. In fact, the old code will always read the 250 bytes as if they were a single string, but that's fine, because the char-count of the old-school Pascal string is there, and the old code will ignore the bytes beyond that _dynamic_ length. In a _new_ record, the char-count of the strings will always be <= 200. – jachguate Feb 15 '13 at 16:51
  • Until the first time someone uses the **old code** to actually write > 200 bytes in a string to the record, and overwrites something the **new code** has put there instead (corrupting the new code values) or the **new code** comes in after and overwrites, corrupting the **old code** values (because it still has a length byte of > 200 at index 0). – Ken White Feb 15 '13 at 16:54
  • @Ken, yes, but the question is about whether it is safe to _read_ the values, not to write them. – jachguate Feb 15 '13 at 17:00
  • Thank you all for your help and comments. – Alex Egorov Feb 15 '13 at 18:11
  • Argh! That's what I'm talking about. :-) It's OK, though. I didn't downvote, because I'm not saying you're wrong. I'm just trying to point out that you can't *assume* that it's safe to read from one side without considering what happens if it's written to from the other. :-) No matter. – Ken White Feb 15 '13 at 19:10