
Given an enum TEnum with 33 items (Ceil (33 / 8) = 5 bytes), and a TEnumSet = Set of TEnum, the SizeOf (TEnumSet) gives a different result when running in 32 vs. 64-bit Windows:

  • 32 bit: 5 bytes as per the calculation above
  • 64 bit: 8 bytes

When the number of elements in the enum increases, the size changes to, say, 6 bytes in 32-bit, while in 64-bit it remains 8 bytes. It looks as if memory alignment in 64-bit rounds the size up to the nearest multiple of some value, but which one? It is not 8, since smaller enums yield set sizes of 2 or 4, and a power of 2 is most likely not the rule either.

In any case, this causes a problem when reading a file into a packed record. The file was written as a buffer by a 32-bit program. Reading the same file in a 64-bit program fails because the packed record sizes don't match (the record contains this mismatching set, among other things).

I tried looking in the compiler options for some options related to memory alignment: there is an option for record memory alignment but it does not impact sets, and is already the same in both configurations.

Any explanation on why the set is taking more memory in 64-bit, and any potential solutions to be able to read the file into my packed record on a 64-bit platform?

Note that I have no control over the writing of the file: it is written using a 32-bit program to which I don't have access (so altering the writing is not an option).

Khorkhe
  • Regarding the `Set` size difference in 64bit, see [Enumeration set size in x64](https://stackoverflow.com/questions/30336620/). A workaround for the reading issue would be to read the `Set` bytes from the file into a `Byte[5]` array first, then manually use bitwise operations to assign enum values to your `Set` variable. – Remy Lebeau Aug 29 '21 at 20:19

2 Answers


Here is my test program:

{$APPTYPE CONSOLE}

type
  TEnumSet16 = set of 0..16-1;
  TEnumSet17 = set of 0..17-1;
  TEnumSet24 = set of 0..24-1;
  TEnumSet25 = set of 0..25-1;
  TEnumSet32 = set of 0..32-1;
  TEnumSet33 = set of 0..33-1;
  TEnumSet64 = set of 0..64-1;
  TEnumSet65 = set of 0..65-1;

begin
  Writeln(16, ':', SizeOf(TEnumSet16));
  Writeln(17, ':', SizeOf(TEnumSet17));
  Writeln(24, ':', SizeOf(TEnumSet24));
  Writeln(25, ':', SizeOf(TEnumSet25));
  Writeln(32, ':', SizeOf(TEnumSet32));
  Writeln(33, ':', SizeOf(TEnumSet33));
  Writeln(64, ':', SizeOf(TEnumSet64));
  Writeln(65, ':', SizeOf(TEnumSet65));
end.

And the output (I am using XE7 but I expect that it is the same in all versions):

32 bit   64 bit
16:2     16:2
17:4     17:4
24:4     24:4
25:4     25:4
32:4     32:4
33:5     33:8
64:8     64:8
65:9     65:9

Leaving aside the 32 vs 64 bit difference, notice that although the 17 and 24 bit cases could theoretically fit in a 3 byte type, they are stored in a 4 byte type.

Why does the compiler choose to use a 4 byte type rather than a 3 byte type? It can only be that this allows for more efficient code. Operating on data that can be mapped directly onto CPU registers is more efficient than picking at the data byte by byte, or in this case by accessing two bytes in one operation, and then the third byte in another.

This then points to why anything between 33 and 64 bits is mapped to an 8 byte type under the 64 bit compiler. The 64 bit compiler has 64 bit registers, and the 32 bit compiler does not.
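To see how this carries over to a packed record like the one in the question, a minimal sketch follows. `TSample` and its fields are hypothetical, since the real file layout is not shown. Going by the table above, the set field should account for 5 bytes under the 32 bit compiler and 8 bytes under the 64 bit compiler, so the two record sizes would differ by 3.

```pascal
program RecordSizeDemo;

{$APPTYPE CONSOLE}

type
  TEnumSet33 = set of 0..33-1;

  // Hypothetical record, for illustration only.
  TSample = packed record
    Id: Integer;        // 4 bytes on both platforms
    Flags: TEnumSet33;  // size depends on the target compiler
  end;

begin
  // Compile once for Win32 and once for Win64 and compare the output.
  Writeln('SizeOf(TEnumSet33): ', SizeOf(TEnumSet33));
  Writeln('SizeOf(TSample): ', SizeOf(TSample));
end.
```

Note that `packed` removes padding between fields but does not change the size of the set type itself.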

As for how to solve your problem, then I can see two main approaches:

  1. In your 64 bit program, read and write the record field by field. For the fields which are afflicted by this 32 vs 64 bit issue, you will have to introduce special code to read and write just the first 5 bytes of the field.
  2. Change your record definition to replace the set with array [0..4] of Byte, and then introduce a property that maps the set type onto that 5 byte array.
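The second approach could be sketched roughly as below. This is an illustration under assumptions, not a drop-in solution: `TFileRecord`, `FSetBytes` and `EnumSet` are hypothetical names, `TEnum` is a 33-value subrange standing in for the real enum, and the file is assumed to store element `n` in bit `n mod 8` of byte `n div 8`, which is how the 32 bit compiler lays out a set whose lowest ordinal is 0. The conversion uses bitwise operations, as suggested in the comments on the question, so it does not depend on how the 64 bit compiler lays out the set in memory.

```pascal
program SetFromBytes;

{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for the question's 33-item enum.
  TEnum = 0..32;
  TEnumSet = set of TEnum;

  // Hypothetical record; only the 5-byte set field is shown.
  TFileRecord = packed record
  private
    FSetBytes: array [0..4] of Byte; // the 5 bytes as stored in the file
    function GetEnumSet: TEnumSet;
    procedure SetEnumSet(const Value: TEnumSet);
  public
    property EnumSet: TEnumSet read GetEnumSet write SetEnumSet;
  end;

function TFileRecord.GetEnumSet: TEnumSet;
var
  E: TEnum;
begin
  Result := [];
  // Test each element's bit in the raw bytes and rebuild the set.
  for E := Low(TEnum) to High(TEnum) do
    if (FSetBytes[Ord(E) shr 3] and (1 shl (Ord(E) and 7))) <> 0 then
      Include(Result, E);
end;

procedure TFileRecord.SetEnumSet(const Value: TEnumSet);
var
  E: TEnum;
begin
  FillChar(FSetBytes, SizeOf(FSetBytes), 0);
  // Set one bit per element present in Value.
  for E := Low(TEnum) to High(TEnum) do
    if E in Value then
      FSetBytes[Ord(E) shr 3] :=
        FSetBytes[Ord(E) shr 3] or (1 shl (Ord(E) and 7));
end;

var
  Rec: TFileRecord;
begin
  Rec.EnumSet := [0, 8, 32];
  // The array field is 5 bytes on both platforms, so the record
  // size no longer depends on the compiler's set size.
  Writeln('SizeOf(TFileRecord): ', SizeOf(TFileRecord));
  Writeln('32 in EnumSet: ', 32 in Rec.EnumSet);
end.
```

With a real enumerated type, only the `TEnum` declaration would change; the `Ord(E)` calls already cover that case. The rest of the record can then be read in a single block, because its size matches what the 32 bit program wrote.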
David Heffernan
  • accepted this answer as it provides the under-the-hood explanation in addition to the suggestion in the comments. – Khorkhe Aug 31 '21 at 08:58

Relying on the memory size of a set leads to errors sooner or later. This becomes particularly clear when working with subtypes.

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils;

type
  TBoolSet=set of boolean;
  TByteSet=set of byte;
  TSubEnum1=5..10;
  TSubSet1=set of TSubEnum1;
  TSubEnum2=205..210;
  TSubSet2=set of TSubEnum2;
var
  i, j: integer;
  a, a1: TByteSet;
  b, b1: TSubSet1;
begin
  try
    writeln('SizeOf(TBoolSet): ', SizeOf(TBoolSet)); //1
    writeln('SizeOf(TByteSet): ', SizeOf(TByteSet)); //32
    writeln('SizeOf(TSubSet1): ', SizeOf(TSubSet1)); //2
    writeln('SizeOf(TSubSet2): ', SizeOf(TSubSet2)); //2
  
    // Assignments are allowed.
    a := [6, 9];
    b := [6, 9];
    writeln('a = b ?: ', BoolToStr(a = b, true)); //true

    a1 := a + b; //OK
    b1 := a + b; //OK
    a  := [7, 200];
    b1 := a + b; //??? no exception, value 200 was lost!
    i  := 0;
    for j in b1 do
      i := succ(i);
    writeln('b1 Count: ', i);

    readln(i);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
USauter
  • `i := succ(i);` seems like a strange way to write `Inc(i)`. – Andreas Rejbrand Aug 30 '21 at 07:26
  • @AndreasRejbrand `succ/pred` old pascal school. Old but tried and tested. – USauter Aug 30 '21 at 09:09
  • Aren't `Inc` and `Dec` just as old school, tried and tested? – Andreas Rejbrand Aug 30 '21 at 09:13
  • ``inc`` & ``dec`` did not exist in my first textbook. – USauter Aug 30 '21 at 09:35
  • Using `Succ()` and `for x in` is quite inconsistent, because the latter wasn't in your first textbook either. Also `Readln()` can be parameterless, even in the past. – AmigoJack Aug 30 '21 at 13:12
  • "Working with the memory size of a set leads to process errors sooner or later. This becomes particularly clear when working with subtypes." How does this answer the question that was asked? – David Heffernan Aug 30 '21 at 13:14
  • @DavidHeffernan * You also need to think about program maintenance. Does this personal knowledge flow into the work of colleagues? What if a colleague works with subtypes? * Will it still work with the next Delphi release? For me the way looks like a temporary solution. – USauter Aug 31 '21 at 06:04
  • Certainly maintenance is an important issue. My point though is that your post doesn't answer the question asked. – David Heffernan Aug 31 '21 at 06:14
  • @DavidHeffernan It describes the mistakes your colleagues can make with this model. No more and no less. – USauter Aug 31 '21 at 07:33
  • Yes. But what about the question that was asked? That has to be answered, it's a precondition for an answer to be posted here. By all means elaborate, but the question does need to be answered. – David Heffernan Aug 31 '21 at 21:10
  • @DavidHeffernan If you look at it purely academically, then your statement is correct. Is the problem just academic? – USauter Sep 01 '21 at 07:23
  • I'm trying to pass on my experience and knowledge as a long-time user here on this site. Answering the question is essential. It's your choice whether or not you heed that advice. – David Heffernan Sep 01 '21 at 08:51