1

Excel's INDEX function shows strange behaviour when the object being indexed is a name introduced in a LET construct or a parameter of a LAMBDA. The behaviour is consistent on Windows and Mac.

Suppose cell B2 contains the formula

=SEQUENCE(50000)

Then the following formulas are all relatively performant:

=MAP(SEQUENCE(2000),LAMBDA(x,INDEX(B2#,x)))

=LAMBDA(MAP(SEQUENCE(2000),LAMBDA(x,INDEX(B2#,x))))()

=LET(mgen,LAMBDA(B2#),LAMBDA(MAP(SEQUENCE(2000),LAMBDA(x,INDEX(mgen(),x)))))()

The following formulas, however, are terribly slow

=LET(m,B2#,MAP(SEQUENCE(2000),LAMBDA(x,INDEX(m,x))))

=LET(m,B2#,LAMBDA(MAP(SEQUENCE(2000),LAMBDA(x,INDEX(m,x)))))()

=LAMBDA(m,MAP(SEQUENCE(2000),LAMBDA(x,INDEX(m,x))))(B2#)

The performance problem gets the worse the longer array B2 is. You can crash Excel with the latter 3 formulas when you replace 50000 by 500000 in B2, while the first three formulas still work perfectly fine.

Note that the length of B2 (the array indexed) should in theory not have any impact on performance, since INDEX is called the exact same number of times in all my examples.

To me, INDEX seems to have performance problems whenever the first argument does not directly refer to a worksheet range.

Yet if that is so -- how can I efficiently (in constant time) get the n-th element of a LET/LAMBDA-named array?

I cannot work around this by writing the indexed array into a cell, since in my case, the indexed array is the result of another lambda.

Edit for clarification: The only purpose of the MAP/SEQUENCE(2000) construct in my examples is to make 2000 separate calls to INDEX, in order for the performance differences to become visible. The construct is completely unrelated to the problem. The performance problems occur whenever I make a lot of INDEX calls with the first argument being a LET/LAMDA name.

Second edit: It seems, however, that some kind of loopy construct is needed to reproduce the problem. I have not been able to reproduce the problem with 2000 separate LET/INDEX formulas in 2000 cells.

ZygD
  • 22,092
  • 39
  • 79
  • 102
JohnB
  • 13,315
  • 4
  • 38
  • 65
  • I also posted this here: https://feedbackportal.microsoft.com/feedback/idea/da2f677a-3735-ed11-a81b-000d3a7dd504 – JohnB Sep 15 '22 at 20:47
  • Why not just do `=LET(m,B2#,INDEX(m,sequence(2000))` or just `=TAKE(B2#,2000)` – Scott Craner Sep 15 '22 at 20:51
  • I just tried to create a minimal example demonstrating the problem. The indices I actually need are not sequential. I do need to INDEX single elements because the index I need to access next depends on the value just read. – JohnB Sep 15 '22 at 20:58
  • A combination of DROP and TAKE seems to be slow as well :-( – JohnB Sep 15 '22 at 21:07
  • 2
    Then, this question boils down to, "Why did Microsoft not make their function more performant?" Which is a question we have been asking since they introduced the heavy calc formulas like SUMIFS and COUNTIFS. And the answer...No one know but Microsoft. Maybe I would suggest taking your real life formula and need to codereview to see if anyone can help make it more performant. – Scott Craner Sep 15 '22 at 21:08
  • Re "*To me, INDEX seems to have performance problems whenever the first argument does not directly refer to a worksheet range.*", Scott's suggestion of `=LET(m,B2#,INDEX(m,SEQUENCE(2000)))` would appear to contradict this statement, since it is as performant as your first three examples. – Jos Woolley Sep 16 '22 at 04:51
  • No, that is not a contradiction. I suspect that a call to `INDEX` is slow when the array being indexed into is a LET name, irrespective of how many values you request from the array in that one call. In the example you mention, `INDEX` is called only once. I suspect this call is indeed much slower (measured in microseconds) than `INDEX(B2#,SEQUENCE(2000))` would be, yet since it is only a single call, we do not notice the difference and do not care. If you have a thousand such calls, you do see the difference, though. – JohnB Sep 16 '22 at 05:34
  • It *is* a contradiction to your statement "*"To me, INDEX seems to have performance problems **whenever the first argument does not directly refer to a worksheet range**.*". Re your latest comment, am I to understand that you believe the `LET` function is forcing the `INDEX` construction to be called more than once? – Jos Woolley Sep 16 '22 at 06:28
  • No, I did not mean that the LET function forces the INDEX function to be called more often. I say that *if* we have a lot of calls to INDEX (for whatever reason), we run into performance problems when we use LET even when there would not be any performance issues without LET. – JohnB Sep 16 '22 at 06:41

0 Answers0