I want to visit the nodes of an AST built from a Java file and extract some information from them, but I want to traverse the AST guided by the lines of the source file. I know each node carries information about its source lines, but the default way to reach nodes is through type-specific visitors. So, in order to (1) avoid redundant visits to nodes, (2) avoid the overhead of enumerating every possible node type (or visitor), and (3) access node information in an ordered way, I need a kind of "Line Visitor" that lets me access the information in AST nodes by following the lines of the source file. Does anyone know a standard way to do this with the Eclipse JDT API, or even a workaround?
-
You should probably post some code showing what your processing currently looks like and which information you are missing. – lexicore Jul 04 '18 at 11:44
-
The missing information is exactly this mapping from lines to nodes. The order is important here: I know the inverse relation (node to line) is already available in the node API, but that is not exactly what I want, as I highlighted in the description above. This is why the available visitors and their traversal model over the AST don't fit my requirements very well. I suppose I will need a workaround... – Victor Sobreira Jul 04 '18 at 20:06
-
**You should probably post some code** showing what your processing currently looks like and which information you are missing. – lexicore Jul 04 '18 at 20:59
-
A similar discussion was started on [another question](https://stackoverflow.com/questions/51163199/in-eclipse-jdt-java-parser-is-it-possible-to-traverse-the-ast-node-by-node-with?noredirect=1#comment89343659_51163199)... So, please look there to avoid duplicated clarifications about code posting for this question, because the rationale is the same. – Victor Sobreira Jul 04 '18 at 21:38
1 Answer
I can't speak from direct knowledge of Eclipse ASTs. However, to the extent that these are traditional ASTs simply represented in Java, then in the absence of any other help, pretty much the only way to visit the tree nodes is by walking the tree.
Of course, you can probably filter AST nodes by whatever file position information (line, column, ...) Eclipse associates with them, and simply keep the nodes stamped with the line you want. Unless you really, really care about how long this takes, this should be good enough for your purposes: it is worst-case linear in the file size, and my experience with other systems suggests you get ~5-7 nodes per source line on average.
If you wanted direct access to the tree nodes associated with a specific line number, my guess is you are out of luck. Obviously, you can build such a map yourself by walking the tree once and collecting all the nodes that have a specific line number; then you could have the access you want. [You really only need to associate the first AST node of a line (leftmost in an inorder tree walk) for this map to be usable.] Again, the tree walk to build this map is linear time and you'll only pay for it once. FWIW, I've been building tools that process ASTs for ~30 years and have not found this to be particularly useful.
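The one-pass map described above can be sketched in plain Java. This is only an illustration under stated assumptions: `Node` is a hypothetical stand-in for an AST node, not a JDT type, and `LineIndexSketch` and `buildLineIndex` are names invented here. With Eclipse JDT, the equivalent would presumably be an `ASTVisitor` that overrides `preVisit(ASTNode)` and asks `CompilationUnit.getLineNumber(node.getStartPosition())` for the line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an AST node (NOT a JDT class).
final class Node {
    final String label;
    final int line;                               // 1-based line where the node starts
    final List<Node> children = new ArrayList<>();

    Node(String label, int line) {
        this.label = label;
        this.line = line;
    }

    // Adds a child and returns this node, so calls can be chained.
    Node add(Node child) {
        children.add(child);
        return this;
    }
}

final class LineIndexSketch {
    // Walks the tree once (pre-order) and groups nodes by their starting line.
    // A TreeMap keeps the lines sorted, so iteration follows source order.
    static TreeMap<Integer, List<Node>> buildLineIndex(Node root) {
        TreeMap<Integer, List<Node>> index = new TreeMap<>();
        collect(root, index);
        return index;
    }

    private static void collect(Node node, Map<Integer, List<Node>> index) {
        index.computeIfAbsent(node.line, k -> new ArrayList<>()).add(node);
        for (Node child : node.children) {
            collect(child, index);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("CompilationUnit", 1)
            .add(new Node("TypeDeclaration", 1)
                .add(new Node("MethodDeclaration", 2)
                    .add(new Node("ReturnStatement", 3))));

        // Visit nodes line by line, in ascending line order.
        buildLineIndex(root).forEach((line, nodes) -> {
            StringBuilder sb = new StringBuilder("line " + line + ":");
            for (Node n : nodes) {
                sb.append(' ').append(n.label);
            }
            System.out.println(sb);
        });
    }
}
```

Within one line, nodes appear in pre-order, so the first entry of each list is the outermost node starting on that line. Keeping only that first entry would give the smaller map mentioned in the bracketed remark.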
If you insist, and you want to lower the cost of building this map, I'd look inside the parsing machinery and modify it to do this work. It manufactures all those AST nodes, and it knows the line number of the source being processed when it manufactures such a node. It should be easy to build the map as the AST nodes are generated. If your parser is any good, it is effectively linear time, and adding this work won't change the linearity.

-
According to your answer, I suppose I'm out of luck... :( Anyway, your suggestion is a good starting point for a workaround (it more or less confirms what I had in mind). I suppose I should look at the Eclipse JDT source to find out how to proceed with this traversal, or try a "possibly not so efficient solution" using the available visitors... Thanks for the suggestions! – Victor Sobreira Jul 04 '18 at 20:21
-
Let me expand on "not found this particularly useful". Line numbers on nodes are an accident of how the programmer chose to format the text. (S)he might have formatted it beautifully from the point of view of other readers; (s)he might have produced a stunningly bad layout, say one token per line with random indent distance. What kind of tool would you want to build that depends on such whimsical choices? – Ira Baxter Jul 04 '18 at 21:44
-
... what I have found useful, when reporting an error on line N, is to open the source file and fetch lines (N-k...N+k) [i.e., the line and its k lines of surrounding text context] to show to the user. But that's not on the AST. And while useful, it feels ugly to have to read the source file line by line to get this information, esp. if I might report issues on multiple lines C, K, N, Q, T and read the file sequentially for each report. [We actually do this... by building a source line index, analogous to the AST line index proposed above, the first time we have to do this for a file.] – Ira Baxter Jul 04 '18 at 21:46
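
A source line index of the kind mentioned in the last comment might look roughly like the sketch below. It is not the commenter's actual implementation: `SourceLineIndex`, `line`, and `context` are hypothetical names, the whole file is assumed to be in memory as a `String`, and lines are assumed to end with `\n` (with `\r\n` endings, each returned line would keep a trailing `\r`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records where each line starts, once, so that
// later lookups of lines N-k..N+k do not re-scan the file.
final class SourceLineIndex {
    private final String source;
    private final List<Integer> lineStarts = new ArrayList<>(); // offset of each line's first char

    SourceLineIndex(String source) {
        this.source = source;
        lineStarts.add(0);                        // line 1 starts at offset 0
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                lineStarts.add(i + 1);            // next line starts after the newline
            }
        }
    }

    // Number of lines; a trailing newline counts as starting one more (empty) line.
    int lineCount() {
        return lineStarts.size();
    }

    // Returns the text of the 1-based line n, without its newline.
    String line(int n) {
        int start = lineStarts.get(n - 1);
        int end = n < lineStarts.size() ? lineStarts.get(n) - 1 : source.length();
        return source.substring(start, end);
    }

    // Returns lines n-k..n+k, clamped to the bounds of the file.
    List<String> context(int n, int k) {
        int first = Math.max(1, n - k);
        int last = Math.min(lineCount(), n + k);
        List<String> result = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            result.add(line(i));
        }
        return result;
    }
}
```

Building the index is a single linear pass over the text; each later report then costs only the length of the lines it shows.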