
Typically scan, in both its left and right variants, is O(n) in both space and time. However, APL's \ operator, although it resembles scanl, seems to behave differently: it is right-associative and runs over the array again for each element, which makes it O(n^2).

For example,

nums ← 10?10  ⍝ 1 7 4 5 10 3 9 6 2 8
⌈\nums        ⍝ 1 7 7 7 10 10 10 10 10 10

gives me the correct result, but by right associativity it is equivalent to

(1 f (7 f (4 f (5 f (10 f (3 f (9 f (6 f (2 f 8)))))))))     ⍝ where f ← (⊣,⌈)

so the last operation is 1 f (7 7 7 10 10 10 10 10 10)

Isn't this inefficient? What is the actual big-O complexity here, and is there an idiomatic optimization?

mazin
  • yeah apl scan sucks but you can roll your own with https://aplcart.info/?q=scanl# – rak1507 Dec 08 '21 at 10:07
  • not only that but it seems to give wrong answers when order matters https://stackoverflow.com/questions/70273199/what-is-the-correct-approach-to-efficiently-perform-a-scanl-in-apl – mazin Dec 08 '21 at 10:10
  • it's not 'wrong' it's just not the regular order – rak1507 Dec 08 '21 at 10:11
  • Well, if it doesn't behave like a scanl should, I would call that wrong. I guess you could argue that it's not technically a scanl, but it's confusing coming from Haskell etc. to call it scan – mazin Dec 08 '21 at 10:16
  • yeah, it's definitely confusing, and it probably is 'wrong' but it's by choice – rak1507 Dec 08 '21 at 10:17

1 Answer

You are right in your algorithmic description of Scan, and in the general case it is indeed O(n²). However, by far the most common use of it is with one of a known set of scalar primitives (including + and <, among others). These are recognised by the interpreter, which then uses special O(n) code (or less, as some cases can exit early).

We can easily demonstrate this by comparing the performance of +\ with the functionally identical +∘⊢\ (plus, where the right argument is pre-processed by the identity function):

      'cmpx'⎕CY'dfns'
      a←?2000⍴127
      cmpx'+∘⊢\a'
1.8E¯1
      a←?4000⍴127
      cmpx'+∘⊢\a'
7.6E¯1

We can see that +∘⊢\ took about 4 times as long (about 0.2 s → 0.8 s) when we doubled the number of small integers from 2000 to 4000. By contrast:

      a←?2000⍴127
      cmpx'+\a'
1.5E¯6
      a←?4000⍴127
      cmpx'+\a'
2.9E¯6

+\ only doubles (1.5 µs → 2.9 µs) when going from 2000 to 4000 small integers. Also note the extreme performance difference between the optimised case and the non-optimised case: roughly five orders of magnitude.
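
The quadratic growth of the non-optimised case follows from how Scan is defined: each result item is the right-to-left reduction of one prefix of the argument. As a rough sketch (model and v are names made up here for illustration; this is not how the interpreter is implemented), the definition can be written out and the number of function applications counted:

```apl
      model←{⍺⍺/¨,\⍵}          ⍝ hypothetical operator: reduce every prefix of a vector
      v←1 2 3 4
      - model v
1 ¯1 2 ¯2
      (-\v)≡(- model v)        ⍝ same result as the primitive Scan
1
      {⍵×(⍵-1)÷2}2000 4000     ⍝ a prefix of length k needs k-1 applications
1999000 7998000
```

Under this model, doubling the argument length roughly quadruples the number of applications, which is in line with the 4× slowdown measured for +∘⊢\ above.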

Scan gets its order of evaluation from Reduce. Iverson explains in Conventions Governing Order of Evaluation:

  1. In the definition
       F/x ≡ x1 F x2 F x3 F ... F x⍴x
    the right-to-left convention leads to a more useful definition for nonassociative functions F than does the left-to-right convention. For example, -/x denotes the alternating sum of the components of x , whereas in a left-to-right convention it would denote the first component minus the sum of the remaining components. Thus if d is the vector of decimal digits representing the number n , then the value of the expression 0=9|+/d determines the divisibility of n by 9 ; in the right-to-left convention, the similar expression 0=11|-/d determines divisibility by 11 .
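
A small session may make the quoted convention concrete. It also includes one possible left-to-right variant for cases where the Haskell-style order is wanted. The name scanl is hypothetical and the definition is only a sketch, assuming a non-nested vector argument and a function that returns scalars:

```apl
      d←1 2 1          ⍝ decimal digits of 121 = 11×11
      -/d              ⍝ 1-(2-1), the alternating sum
0
      0=11|-/d         ⍝ 1 means divisible by 11
1
      -\1 2 3 4        ⍝ each item reduces one prefix right to left
1 ¯1 2 ¯2
      ⍝ hypothetical helper: carry the running result along and append to it
      scanl←{f←⍺⍺ ⋄ ⊃{⍵,(⊃⌽⍵)f ⍺}/(⌽1↓⍵),⊂1↑⍵}
      - scanl 1 2 3 4  ⍝ 1, 1-2, (1-2)-3, ((1-2)-3)-4
1 ¯1 ¯4 ¯8
```

This sketch applies the operand only n-1 times, but the repeated catenation may copy the accumulated result, so it is not guaranteed to be linear overall. For the primitives that the interpreter special-cases, the built-in \ is likely to remain much faster.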
Adám
  • I have a follow up question: https://stackoverflow.com/questions/70273199/what-is-the-correct-approach-to-efficiently-perform-a-scanl-in-apl – mazin Dec 08 '21 at 10:09