
How to obtain a syntax tree representation of a Dyalog APL function (or an expression)?

I'd like to check whether a certain user-defined sub-function is called with a certain number of elements in its right argument, where that argument is written explicitly as a vector in the APL source code. For example, given fooNction x(y+1)3, where fooNction is a user-defined function and x and y are variables, I'd like to find out that its right argument has 3 components.

The only thing I can find is ⎕NR (Nested Representation), which returns the function source as a vector of character vectors, but I need a complete syntactic parse.

Update:

I'm interested in the number of items written as a literal vector (of anything) in the source, not in the length of a value returned by some expression passed as the argument. ⎕NC is available for all called user-defined functions and operators (but not for all variables or namespaces).

Olexa

2 Answers


I fear it's impossible to come up with a perfect solution for this. If you look at the line A←B C D, what is it? Are we stranding the 3 variables B C D, or are we calling function B with argument (C D), or perhaps calling function C with B and D as arguments? And while B might be a function now, it is still possible that the code in the lines just before this statement ⎕EXes B and declares a local variable B, etc. If you have no knowledge about B, C and D, this could mean anything... Now, if you are just looking for occurrences of fooNction, you could gather those calls and inspect 'em. But what if you see an argument H? It could be a niladic function that returns a vector of 2 or 3 or more elements...

The dynamic nature of APL makes such static analysis... eh, difficult or even impossible.
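To make the ambiguity concrete, here is a small session sketch. The values and the definition given to B, C and D are made up for the illustration; the only system functions used are ⎕NC and ⎕EX, and a Dyalog session with default settings is assumed.

```apl
      B←1 ⋄ C←2 ⋄ D←3
      ⎕NC ⍪'BCD'           ⍝ one name per row; all three are class 2 (variable)
      ≢B C D                ⍝ 3: the text B C D strands three values
      ⎕EX 'B' ⋄ B←{≢⍵}     ⍝ erase B and rebind it as a function (hypothetical definition)
      ⎕NC ⍪'BCD'           ⍝ B is now class 3 (function)
      B C D                 ⍝ 2: B is applied to the two-item vector C D
```

The source text of the last expression never changed; only the name class of B did, and that is exactly the information a static parser does not have.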

Do it anyway - DIY

You replied "It could be a niladic function that returns a vector of 2 or 3 or more elements... I consider it as a single argument then. This is enough for my task." - and you're not alone with that! Many APLers have had similar wishes regarding their code, and like you, they'd have been happy to accept limitations in the analysis. But the limitations you accept depend on the questions you ask, and that's probably why there hasn't been enough energy to come up with a "community effort" for it. And, ofc, there's the idea that "it's only a few lines of code", so why bother - but fortunately there is also a trend towards building and using libraries.

Just to throw in a few ideas on how you could deal with analysing calls of fooNction:

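∇ test function;nr;matches;m
  nr←⎕NR function                  ⍝ source lines as a vector of character vectors
  ⍝ capture the text after fooNction up to the next ⋄, comment, or end of line
  matches←('\bfooNction\b([^⋄⍝]*)'⎕S'\1')nr
  :For m :In matches
      ⍝ do something with m...
  :EndFor
∇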

This will find all calls of fooNction (the results contain just the right argument), so you could then either visually inspect those calls or try some string analysis to extract and count distinct entities.
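As a rough idea of what such string analysis could look like, here is a minimal sketch that counts the top-level items of one captured argument. CountStrandItems is a hypothetical name, not a library function, and the rules are a heuristic of my own: a parenthesised group or a string literal counts as one item, each blank-separated name or number counts as one item, and anything containing a primitive at the top level is rejected with ¯1. It assumes Dyalog 16.0 or later with default ⎕ML (for ⍸ and partition ⊆), and it does not handle ⍬, ⍺, ⍵ or ⎕-names.

```apl
CountStrandItems←{
    ⍝ ⍵: argument text, e.g. ' x(y+1)3 '.  Result: item count, or ¯1 if not a plain strand.
    quote←''''=⍵
    inStr←quote∨≠\quote                 ⍝ characters belonging to a string literal
    open←(~inStr)∧'('=⍵
    close←(~inStr)∧')'=⍵
    depth←+\open-close                  ⍝ parenthesis nesting depth at each character
    inGrp←close∨depth>0                 ⍝ characters belonging to a parenthesised group
    nGrp←+/open∧depth=1                 ⍝ groups opened at the top level
    nStr←+/(depth=0)∧inStr>¯1↓0,inStr   ⍝ string literals starting at the top level
    s←⍵ ⋄ s[⍸inStr∨inGrp]←' '           ⍝ blank out strings and groups
    ok←' ¯._∆⍙',⎕D,⎕A,'abcdefghijklmnopqrstuvwxyz'
    ~∧/s∊ok:¯1                          ⍝ a primitive at the top level: not a plain strand
    nGrp+nStr+≢(' '≠s)⊆s                ⍝ groups + strings + blank-separated names and numbers
}
```

```apl
      CountStrandItems ' x(y+1)3 '     ⍝ 3
      CountStrandItems ' x+y 3'        ⍝ ¯1
      CountStrandItems ' H'            ⍝ 1
```

A result of 1 only means "a single token": the length of H, of a lone (x y) or of a lone 'abc' cannot be read off this way, which matches the "single argument" limitation accepted in the comments.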

Update:

I shared your question with a famous (but banned) AI tool. The answer is hilarious - I wish I could answer so inaccurately with such certainty myself!

MBaas
  • "but what if you see an argument H? It could be a niladic function that returns a vector of 2 or 3 or more elements..." I consider it as a single argument then. This is enough for my task. – Olexa Mar 10 '23 at 11:14
  • I added some thoughts to my reply ;) – MBaas Mar 11 '23 at 13:54

A general problem with parsing APL functions is that the same APL statement can result in different parse trees, and without further knowledge about the symbols in the expression you cannot decide which one applies.

For example:

The statement A B C could have any of the following valid parse trees (and a few more that result in a SYNTAX ERROR at runtime):

  1. (A) (B) (C) ⍝ values A, B, and C
  2. (A(B(C))) ⍝ monadic A, B and value C
  3. (A) (B(C)) ⍝ values A, C, monadic B
  4. (B(A, C)) ⍝ values A, C, dyadic B

This ambiguity makes it impossible to parse A B C statically, and the parse of the same A B C can even differ between calls of the function that contains it. According to the APL standard, the binding of A, B, and C happens right before the statement is executed and (unfortunately) NOT when the function containing the statement is defined.
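If the name classes can be assumed to stay fixed, ⎕NC can stand in for the missing knowledge. The following is only a sketch under that assumption: Describe is a hypothetical helper, the four patterns it recognises are my own selection, and the names are looked up in whatever scope is current when it runs.

```apl
Describe←{
    c←⌊|⎕NC ,¨⍵          ⍝ ⍵: three names; 2 = variable, 3 = function, 0 = undefined
    c≡2 2 2:'strand of three values'
    c≡3 3 2:'first name applied to the result of the second applied to the third'
    c≡2 3 2:'middle name called with a value on each side'
    c≡3 2 2:'first name called on a two-item strand'
    0∊c:'undefined name: cannot classify'
    'another combination (operators, namespaces, or a SYNTAX ERROR at run time)'
}
```

```apl
      A←10 ⋄ C←30 ⋄ B←{⍺←0 ⋄ ⍺+⍵}    ⍝ hypothetical definitions
      Describe 'A' 'B' 'C'
```

This reports the classes as they are at the moment Describe runs; if the code under analysis erases or localises a name before the statement executes, the answer can be wrong, for the reason given above.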

Jürgen
  • I could have access to `⎕NC` for all functions that could be used. Dynamically defined functions are not considered. Majority of variables are defined as local ones in a function's header, but there could be global ones as well. Let's say, everything that is not a function (`3=⎕NC`) could be considered a variable. – Olexa Mar 10 '23 at 11:21