The Julia manual states:

Every Julia program starts life as a string:

julia> prog = "1 + 1"
"1 + 1"

I can easily get the AST of a simple expression, or even of a function, with the help of quote / code_*, or with Meta.parse / Meta.show_sexpr if I have the expression in a string.

The question: is there any way to get the whole AST of a piece of code that may contain several expressions? For example, to read a source file and convert it to an AST?
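
To make the limitation concrete, here is a minimal sketch of what the one-argument form of Meta.parse does with a single expression versus a string holding two. The variable names `ex` and `err` are only for illustration.

```julia
# A single expression parses into one Expr.
ex = Meta.parse("1 + 1")
dump(ex)  # prints the tree: an Expr with head :call and args :+, 1, 1

# A string holding two expressions is rejected by the one-argument form,
# so we capture the exception instead of letting it propagate.
err = try
    Meta.parse("x = 1\ny = 2")
catch e
    e
end
println(typeof(err))
```

The second call is the case the question is about: more than one top-level expression in the same string.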

Aleksei Matiushkin
  • What exactly do you mean by "whole AST"? `:(1 + 1)` _is_ as whole as it gets. Do you mean to recursively expand `+` to a sub-AST? (If yes, that doesn't really work). – phipsgabler Nov 17 '19 at 09:22
  • @phipsgabler Yes, by “whole” I mean grab a random file from the internet and dump its AST. Could you point me to some resource giving a hint on _why_ it does not really work? – Aleksei Matiushkin Nov 17 '19 at 09:23
  • If you're talking about files and the internet, I think we are speaking about different things. What I meant is, you can't go from `1 + 1` to `Core.add_int(1, 1)` via ASTs, because ASTs exist before type inference, and only after type inference you know which method of `+` will actually be called. – phipsgabler Nov 17 '19 at 09:26
  • @phipsgabler That I understand. I am not going to do anything with the resulting AST. I just need to dump it, that’s it. I could not find any way to dump a complex Julia source, as opposed to a simple expression like `"1 + 1"`. – Aleksei Matiushkin Nov 17 '19 at 09:28
  • Ah, do you want to call something `parse` on a whole source file, containing more than one single expression? – phipsgabler Nov 17 '19 at 09:39
  • Exactly. A file, or just a string containing several expressions. – Aleksei Matiushkin Nov 17 '19 at 09:47
  • A surprisingly hard question :) There's [`jl-parse-file`](https://github.com/JuliaLang/julia/blob/5fe17cdcff4142e40c3797879c44ceadcb34a923/src/jlfrontend.scm#L78) in FemtoLisp, which you can call in `julia --lisp`, but I have no idea how to do it from inside Julia. Maybe write a variant of [`jl_parse_eval_all`](https://github.com/JuliaLang/julia/blob/1739ca0aea41f6295ffdd671f0ad7be05ad0711d/src/ast.c#L804) in C. – phipsgabler Nov 17 '19 at 09:52
  • I have tried `jl-parse-file`; it returns _Lisp source_, not what is described as the Julia AST. Since Lisp is an AST on its own, I’ll likely use it. Thanks! – Aleksei Matiushkin Nov 17 '19 at 09:58

2 Answers

If you want to do this from Julia instead of FemtoLisp, you can do

function parse_file(path::AbstractString)
    code = read(path, String)
    Meta.parse("begin $code end")
end

This takes a file path, reads the file, and parses its contents into one big expression that can be evaluated.

This comes from @NHDaly's answer, here: https://stackoverflow.com/a/54317201/751061

If you already have your file as a string and don’t want to have to read it again, you can instead do

parse_all(code::AbstractString) = Meta.parse("begin $code end")

It was pointed out on Slack by Nathan Daly and Taine Zhao that this code won't work for modules:

julia> eval(parse_all("module M x = 1 end"))
ERROR: syntax: "module" expression not at top level
Stacktrace:
 [1] top-level scope at REPL[50]:1
 [2] eval at ./boot.jl:331 [inlined]
 [3] eval(::Expr) at ./client.jl:449
 [4] |>(::Expr, ::typeof(eval)) at ./operators.jl:823
 [5] top-level scope at REPL[50]:1

This can be fixed as follows:

julia> eval_all(ex::Expr) = ex.head == :block ? eval.(ex.args) : eval(ex);

julia> eval_all(parse_all("module M x = 1 end"));

julia> M.x
1
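
A different route to the same result is to skip the begin ... end wrapper and parse one top-level expression at a time with the two-argument form Meta.parse(str, pos), which returns the expression together with the position where parsing should resume. The following is only a sketch; the name `parse_toplevel` is hypothetical and not part of Julia or of the code above.

```julia
# Hypothetical helper: collect every top-level expression in `code`
# without wrapping the source in `begin ... end`.
function parse_toplevel(code::AbstractString)
    exprs = Any[]
    pos = 1
    while pos <= ncodeunits(code)
        # Two-argument form: returns (expression, next position to parse from).
        ex, pos = Meta.parse(code, pos)
        # `nothing` is returned when only whitespace or comments were left.
        ex === nothing || push!(exprs, ex)
    end
    return exprs
end

exprs = parse_toplevel("module M\nx = 1\nend\ny = 2\n")
foreach(e -> println(e.head), exprs)
```

Because each element is a separate top-level expression, a module definition can be passed to eval directly, for example with foreach(eval, exprs).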

Since the question asker is not convinced that the above code produces a tree, here is a graph representation of the output of parse_all, clearly showing a tree structure.

[Image: graph of the output of parse_all, drawn as a tree]

In case you're curious, those leaves labelled #= none:1 =# are line number nodes, indicating the source line on which the following expression appears.

As suggested in the comments, one can also apply Meta.show_sexpr to an Expr object to get a more "lispy" representation of the AST, without the pretty printing Julia does by default:

julia> (Meta.show_sexpr ∘ Meta.parse)("begin x = 1\n y = 2\n z = √(x^2 + y^2)\n end")
(:block,
  :(#= none:1 =#),
  (:(=), :x, 1),
  :(#= none:2 =#),
  (:(=), :y, 2),
  :(#= none:3 =#),
  (:(=), :z, (:call, :√, (:call, :+, (:call, :^, :x, 2), (:call, :^, :y, 2))))
)
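
If the goal is to hand the tree to another tool rather than print it, the same structure can be walked recursively. The sketch below converts an Expr into nested vectors and drops the line number nodes along the way; `to_sexpr` is a hypothetical helper name, and representing the tree as `Vector{Any}` is an assumption made for illustration.

```julia
# Hypothetical helper: turn an Expr into nested vectors of the form
# [head, arg1, arg2, ...], skipping LineNumberNode entries.
to_sexpr(x) = x  # atoms (symbols, numbers, strings) are returned unchanged

function to_sexpr(ex::Expr)
    args = filter(a -> !(a isa LineNumberNode), ex.args)
    return Any[ex.head, map(to_sexpr, args)...]
end

tree = to_sexpr(Meta.parse("begin x = 1\n y = x + 2\n end"))
println(tree)
```

Every Expr becomes a vector whose first element is the head, which mirrors the layout that Meta.show_sexpr prints.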
Mason
  • It does not return AST though; it returns `quote $code end` (with annotations placed into `$code`.) – Aleksei Matiushkin Nov 17 '19 at 16:27
  • That is an Expr containing all the code from your file. That is Julia’s AST. – Mason Nov 17 '19 at 16:29
  • The “T” in AST stands for “Tree,” so **no**, this is not an AST. – Aleksei Matiushkin Nov 17 '19 at 16:30
  • In what way is that structure not a tree? – Mason Nov 17 '19 at 16:33
  • I have to support Mason here. `Meta.parse("begin\nimport LinearAlgebra\nx = 1\n f(x) = sin(x) + x\nend") |> Meta.show_sexpr` -- `Expr`s are clearly trees, they are just pretty printed. – phipsgabler Nov 17 '19 at 17:23
  • In fact, this is equivalent to what Lisp gives you: `(jl-parse-all "begin\nimport LinearAlgebra\nx = 1\n f(x) = sin(x) + x\nend" "dummy")`, except for the wrapper. – phipsgabler Nov 17 '19 at 17:26
  • Yeah, @AlekseiMatiushkin, Julia's AST is a tree of `Expr`s and literals: `Front end ASTs consist almost entirely of Exprs and atoms (e.g. symbols, numbers).` https://docs.julialang.org/en/v1/devdocs/ast/#Surface-syntax-AST-1 – NHDaly Nov 17 '19 at 17:33
  • I have added a graphic using the TreeView package showing that the output of `parse_all` is a tree. – Mason Nov 17 '19 at 17:40
  • `Meta.parse |> Meta.show_sexpr` as suggested by @phipsgabler did the trick, thank you. It would be great if you could update the answer to mention `Meta.show_sexpr`. Without it, the output I got in the REPL looks like code wrapped in a `quote` block. – Aleksei Matiushkin Nov 17 '19 at 18:51

There's jl-parse-file in the FemtoLisp implementation of the Julia parser. You can call it from the Lisp REPL (julia --lisp), and it returns an S-expression for the whole file. Since Julia's Expr is not much different from Lisp S-expressions, that might be enough for your purposes.

I still wonder how one would access the result of this from within Julia. If I understand correctly, the Lisp functions are not exported from libjulia, so there's no direct way to just use a ccall. But maybe a variant of jl_parse_eval_all can be implemented.

phipsgabler
  • In my use case I am not going to access the result _from within Julia_. I am lazily thinking about running _Julia_ from within the Erlang VM, and the most natural way is not to transpile the source but to map the AST. So my purpose would be to have _Julia source → AST_, write-only. I still keep in mind the idea of having this mapping both ways, so if you stumble upon a reasonable solution, please ping me. Thanks again. – Aleksei Matiushkin Nov 17 '19 at 10:13