3

I'm trying to understand ArrayFormula function, but the official documentation is very poor.

For example, I've got a google sheet with amounts and subtotals calculated with ArrayFormula. There are two formulas, and both produces some magic.

The first one uses SUMIF with two identical ranges inside and wrapped in ROW function. It looks like the first range is expanded by SUMIF itself, while the second one is expanded by ArrayFormula. How do I know which function will expand a range in different cases?

enter image description here

The second formula uses TRANSPOSE, and it seems this brings ArrayFormula to pass this ranges independetly. How do I know the order of data processing in such cases?

enter image description here

player0
  • 124,011
  • 12
  • 67
  • 124
raacer
  • 5,302
  • 3
  • 27
  • 46

2 Answers2

4

This is a fairly big question, but since the first part of the question refers to my answer to this question, I ought to attempt to answer it.

To put it very simply, I think you can reduce it to two cases:

(1) Functions that by their nature always return an array. Some of these are

  • Filter
  • Query
  • Sequence
  • Unique

These are automatically expanded and you don't need to use ArrayFormula with them.

(2) Functions that don't normally return an array but can be coerced into returning an array if you feed them a list of values where a single value is normally expected and you wrap them in ArrayFormula.

Scalar use of If statement:

=If(A3,D3,"")

Vector (arrray formula) use of If statement:

=ArrayFormula(if(A3:A,D3:D,""))

Scalar use of Sumif:

=Sumif(row(B3:B),"<="&B3,C3:C)

Note that although the first and third parameters of the Sumif are ranges, the second parameter is normally a single value. If you replace the second parameter with a range and wrap it in ArrayFormula, the function is re-evaluated for each value in the range, and therefore generates an array. This behaviour depends on the fact that the function itself must contain some code (invisible to the user) that tests whether an array is being passed where a single value would normally be expected, and handles it iteratively. This isn't always the case: Sumifs does not exhibit this behaviour because nobody coded it that way.

Vector use of Sumif:

=ArrayFormula(Sumif(row(B3:B),"<="&B3:B,C3:C))

Expansion of two 1d arrays into a 2d array

I don't know what the correct term is for the expansion of two 1d arrays into a 2d array. Here is a simple example (B1:D1 and A2:A4 are pre-filled and the array formula fills in B2:D4):

enter image description here

It works with other binary operators including < and >, but as far as I know there is no 3d equivalent - it is limited by the 2d nature of a sheet. Also works with functions that take two arguments (Pow, Mod etc.), so this would produce exactly the same result:

=ArrayFormula(ADD(B1:D1,A2:A4))
Tom Sharpe
  • 30,727
  • 5
  • 24
  • 37
  • Thank you for clear explanation. It would be great if you also explain why this formula `=ArrayFormula(TRANSPOSE(ROW(A:A)) < ROW(A:A))` returns matrix. Also being a programmer I have never seen a function that can manipulate on arguments of nested calls, but `ArrayFormula` does. Should I assume `ArrayFormula` evaluates the nested expression like it was a code string? I mean classic programming languages execute nested calls first. – raacer Jun 30 '20 at 22:43
  • I think @player0 has thrown some light on the first part of your comment, but I will have a look at it later. It's interesting to speculate how ArayFormula might actually be implemented so that it propagates through and affects a whole set of nested functions. Since Google Docs is written in Java, my guess is that it might cause the setting of a static class variable, which is tested by each of the functions in the calling sequence. – Tom Sharpe Jul 01 '20 at 09:01
  • @player0 has demonstrated how it works, but he did not explain why `=ArrayFormula(TRANSPOSE(ROW(A:A)) < ROW(A:A))` produces matrixes instead of arrays as we can expect. Does diferent data direction (row and column) force ArrayFormula to process these ranges independently? Is there a way to add third dimension, i.e. third independent range? – raacer Jul 01 '20 at 13:37
  • What about processing nested funcation arguments by ArrayFormula, I mean the sence and idea of formula language itself, not implementation on Java. There should be some rules of formula language processing. Scopes, operation precendence, etc. – raacer Jul 01 '20 at 13:48
  • Still haven't got round to answering the earlier question, but your later question has been out there for a while - maybe start by looking at https://webapps.stackexchange.com/questions/15727/complete-reference-documentation-for-google-spreadsheets-formulae Partlyexplained by fact that GS is based on Excel, which is reasonably well documented https://support.microsoft.com/en-gb/office/overview-of-formulas-in-excel-ecfdc708-9162-49e8-b993-c311f47ca173 – Tom Sharpe Jul 01 '20 at 19:51
2

there are functions which are supported by ARRAYFORMULA and then there are such which are not, eg. not every function can be converted into ArrayFormula variant. for example AND and OR are not supported under AF so you need to use 0/1 logic gates. another function that is not supported is simple SUM... if you want to process arrays with sum you can use MMULT function which is able to process all 4 states - sum, subtraction, multiplication and division.

in other words, MMULT is the "magic" function not AF. you can understand AF as something like "wrapper" that enables calculation over array/range

MMULT resources:

disadvantages:

  • MMULT is "heavy" function eg. more rows you have - the slower it gets
  • SUMIF the 3rd parameter of sumif always needs to be a valid/direct range
player0
  • 124,011
  • 12
  • 67
  • 124
  • Thanks for answer. I think I understand `MMULT`. What is not clear for me is why this expression returns matrix: `=ArrayFormula(TRANSPOSE(ROW(A:A)) < ROW(A:A))` – raacer Jun 30 '20 at 22:51
  • pls see: https://docs.google.com/spreadsheets/d/1ZWnFjl0ocDnutoltCYoUaoAiYJity2WE4PpTU9iUcuM/edit#gid=1203377883 – player0 Jun 30 '20 at 23:08
  • Yes, I know these formulas produce matixes, but why? – raacer Jul 01 '20 at 13:29