0

Result in Test is NOT always initially ()

I have found "Do I need to setLength a dynamic array on initialization?", but I do not fully understand that answer.

More importantly, what is the best approach to "break" the Result/A connection (I need the loop)? Is there some way to force the compiler to "properly" initialize Result? Or should I manually add Result := nil as the first line in Test?

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := 2;
end;

procedure TForm1.Button3Click(Sender: TObject);
var
  A: TArray<Integer>; // Does not help whether A is local or global
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A); // At Test breakpoint:
                  // * FIRST loop: Result is ()
                  // * NEXT loops: Result is (2, 0, 0)
                  //               modifying Result changes A (immediately)
  A := Test(A);   // Result is again ()
end;
hundreAd
  • At your breakpoint the Result array is not yet initialized. Why do you pass "A" as a var argument into Test() if you do not alter it? In your case, doing A := Test(A); is nonsense. The contents of a local variable are undefined until a value is assigned to them. – Marcodor Apr 10 '23 at 14:59
  • Marcodor: this is just a MRE, don't expect it to make too much sense; regardless, from for loop #2 onward, Result IS initialized at the breakpoint, see comments in the code – hundreAd Apr 10 '23 at 15:01
  • No, it only seems initialized. It simply reused the same memory address, with no guarantee that this will always happen. You should always initialize local variables. See: https://docwiki.embarcadero.com/RADStudio/Sydney/en/Variables_(Delphi) – Marcodor Apr 10 '23 at 15:06
  • Marcodor: not according to https://stackoverflow.com/questions/5314918/do-i-need-to-setlength-a-dynamic-array-on-initialization/5315254#5315254: "Dynamic arrays are managed types and so are always initialized to nil"; exception: "the compiler, as an optimization is electing not to re-initialize the implicit local variable inside the loop" – hundreAd Apr 10 '23 at 15:10
  • Yup, managed types are initialized, but not Result / function return values. – Marcodor Apr 10 '23 at 15:36
  • I don't understand your problem. At the first loop A is empty before Test execution and [2,0,0] after execution. `var A` causes `modifying Result changes A (immediately)` – MBo Apr 10 '23 at 15:36
  • MBo: certainly A would be assigned the result AFTER test, but it is NOT immediately obvious that changing Result at the breakpoint IMMEDIATELY affects it – hundreAd Apr 10 '23 at 16:02
  • Marcodor: to my understanding (which might be wrong, hence this question), managed Result types such as integers, dynamic arrays, etc. ARE "properly" initialized; this is an edge case when this does not happen due to being within a for loop (note that the last Test call after the loop behaves "as expected") – hundreAd Apr 10 '23 at 16:07
  • Result is never initialized. Integer is not a managed type. – Marcodor Apr 10 '23 at 16:42
  • This is intended behaviour, a performance optimization. There are some articles about it that I'll try to search out. – David Heffernan Apr 10 '23 at 16:46
  • https://stackoverflow.com/a/5315254/505088 – David Heffernan Apr 10 '23 at 16:49
  • I think we could probably close this question as a duplicate of the one I linked to – David Heffernan Apr 10 '23 at 16:49
  • David Heffernan: actually i referenced this in the original question. i'm still unclear whether this optimization can be turned off, or what is the best way to solve/avoid this issue – hundreAd Apr 10 '23 at 16:51

2 Answers

1

The referenced question is about fields inside a class: those are all zero-initialized, and managed types are properly finalized during instance destruction.

Your code is about calling a function with a managed return type within the loop. A local variable of a managed type is initialized once - at the beginning of the routine. A managed return type under the hood is treated by the compiler as a var parameter. So after the first call, it passes what looks to be A to Test twice - as the A parameter and for the Result.

But your assessment that modifying Result also affects A (the parameter) is not correct, which we can prove by changing the code a bit:

function Test(var A: TArray<Integer>; I: Integer): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := I;
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A, I);
                    
  A := Test(A, 0);
end;

When you single-step through Test you will see that changing Result[0] does not change A. That is because SetLength creates a copy: the compiler introduced a second variable that it uses temporarily for passing Result, and after the call to Test it assigns that variable to A (the local variable). You can see this in the disassembly view, which will look similar to the following for the line in the loop (I use $O+ to make the code a little denser than it would be without optimization):

Project1.dpr.21: A := Test(A, I);
0041A3BD 8D4DF8           lea ecx,[ebp-$08]
0041A3C0 8D45FC           lea eax,[ebp-$04]
0041A3C3 8BD3             mov edx,ebx
0041A3C5 E8B2FFFFFF       call Test
0041A3CA 8B55F8           mov edx,[ebp-$08]
0041A3CD 8D45FC           lea eax,[ebp-$04]
0041A3D0 8B0DC8244000     mov ecx,[$004024c8]
0041A3D6 E855E7FEFF       call @DynArrayAsg
0041A3DB 43               inc ebx

Knowing that the default calling convention passes the first three parameters in eax, edx, and ecx, we know that eax is the A parameter, edx is I, and ecx is Result (the aforementioned Result var parameter is always last). We see that the call uses two different locations on the stack: [ebp-$04], which is the A variable, and [ebp-$08], which is the compiler-introduced variable. After the call, we see that the compiler inserted an additional call to System._DynArrayAsg, which assigns the compiler-introduced temp variable for Result to A.
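The temporary-variable mechanism described above can be approximated in source form. The following is only a sketch, not actual compiler output: TestExplicit, AResult, and Temp are hypothetical names standing in for the hidden var parameter and the compiler-introduced variable.

```pascal
program HiddenResultSketch;

{$APPTYPE CONSOLE}

// Hypothetical hand-written equivalent of Test: the managed function result
// is spelled out as an explicit trailing var parameter.
procedure TestExplicit(var A: TArray<Integer>; I: Integer;
  var AResult: TArray<Integer>);
begin
  // From the second call onward, AResult still references the array
  // produced by the previous call, which A references as well.
  SetLength(AResult, 3); // shared array, so SetLength makes a separate copy
  AResult[0] := I;       // writes to the copy only
end;

var
  A, Temp: TArray<Integer>; // Temp stands in for the compiler-introduced variable
  I: Integer;
begin
  for I := 1 to 3 do
  begin
    TestExplicit(A, I, Temp); // Temp is not reset to nil between iterations
    A := Temp;                // plays the role of the @DynArrayAsg call
  end;
  Writeln(A[0]);
end.
```

Because Temp is never reset between iterations, TestExplicit receives the previous array on the second and third pass. Any write to AResult placed before the SetLength call would therefore land in the array that A still references.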

[Screenshot: the second call to Test]

Stefan Glienke
  • your analysis is not correct: according to the debugger (and what IT says goes! ;) modifying ```Result``` at the breakpoint IMMEDIATELY modifies ```A``` -- starting with the 2nd loop; this, unsurprisingly, also happens with your changed code – hundreAd Apr 11 '23 at 12:35
  • @hundreAd, can you please read once again slowly the answer? – Marcodor Apr 11 '23 at 14:55
  • Stefan Glienke: cannot post screenshot in comments (afaik?), but regardless, your breakpoint is different: it is AFTER ```SetLength```, see what happens BEFORE it executes – hundreAd Apr 12 '23 at 13:46
  • Stefan Glienke: this may be a case where a MRE is too simple to represent the complexity of the actual code, i will have to investigate further to see if there is actually a case where A is "in-place" overwritten by changes to Result (the procedure is generating Result based on A and finally assigns the outcome to A, but A must remain unchanged during this process) – hundreAd Apr 12 '23 at 14:08
  • Stefan Glienke: try adding as first line in Test: ```if Length(Result)>0 then Result[0] := 66;``` – hundreAd Apr 12 '23 at 14:41
0

While I hesitate to call this for-loop compiler optimization a bug, it is certainly unhelpful when modifying array elements directly:

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  if Length(Result) > 0 then // Breakpoint
    Result[1] := 66; // A modified!
  SetLength(Result, 3);
  Result[0] := Result[0] + 1; // A not modified
  Exit;
  A[9] := 666; // Force linker not to eliminate A
end;

After investigation, I conclude that operations that affect the entire array (e.g. SetLength, Copy, or another function that returns TArray<Integer>) will, unsurprisingly, "break" the Result/A aliasing created by the for loop.

It would appear that the safest approach (as per the answer linked to in the original post) is to add Result := nil; as the first line in Test.
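A minimal sketch of that workaround, reusing the Test function from above inside a hypothetical console program (the program name and the Main wrapper are illustrative assumptions, not part of the original code):

```pascal
program ResultNilSketch;

{$APPTYPE CONSOLE}

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  Result := nil;              // drop the reference left over from the previous call
  if Length(Result) > 0 then
    Result[1] := 66;          // not reached: Result is always empty at this point
  SetLength(Result, 3);       // allocates a fresh, zero-filled array; nothing to copy
  Result[0] := Result[0] + 1;
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A);
  Writeln(A[0], ' ', A[1]);   // A[1] stays 0 because the write of 66 never runs
end;

begin
  Main;
end.
```

With Result reset first, every call starts from an empty array regardless of what the caller's hidden variable held, so the array referenced by A cannot be touched through Result.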

If there are no further suggestions, I will eventually accept this as the answer.

NOTE: As an added bonus, starting with Result := nil prevents the array from being copied by SetLength. This is obvious, but for e.g. an array of 100000 elements processed in a loop 100000 times, this small modification yields a ~40% faster execution time.

hundreAd
  • I already explained in my answer that `SetLength` creates a copy - of course, it will modify `A` if you write to `Result` before the call to `SetLength` because dynamic arrays don't have copy-on-write like strings do. Also, this should be a modification to your question and not an answer. – Stefan Glienke Apr 12 '23 at 15:22
  • Stefan Glienke: not a modification, just a clarification -- you seem not to have changed values (at the breakpoint) via the debugger. regardless, this is caused by the for loop not re-initializing Result, otherwise the code would have behaved as [i, at least] expected; the answer is intended to provide a way to solve this, something your post did not – hundreAd Apr 12 '23 at 15:33
  • I never understood the source comment `modifying Result changes A (immediately)` as "change some value in Result via the *debugger*" - your question thus is lacking information that only you know. – Stefan Glienke Apr 12 '23 at 16:17
  • Stefan Glienke: i (apparently erroneously) believed i was clear: the value of Result[?] needs to be changed prior to the instruction at the breakpoint executing. whether you do it by code, via the debugger or by whatever means... i stand corrected and will attempt to be even more clear next time; as for the comment ```immediately``` -- it is not very obvious that changing Result at the breakpoint will change A as well (after the function ends Result will of course be assigned to A) – hundreAd Apr 12 '23 at 16:42