I am using Tree-sitter to parse Clojure code. Specifically, I would like to distinguish between symbols, class names, and Java interop forms.
This is my grammar:
module.exports = grammar({
  name: 'clojure',
  extras: $ => [/[\s,]/],
  rules: {
    program: $ => repeat($._anything),
    _anything: $ => choice($.symbol, $.classname, $.member_access, $.new_class),
    symbol: $ => $._symbol_chars,
    classname: $ => prec.left(3, seq($._symbol_chars, repeat1($._classname_part))),
    _classname_part: $ => prec.right(3, seq($._dot, $._symbol_chars)),
    member_access: $ => seq($._dot, $._class_chars),
    new_class: $ => prec(2, seq(choice($.symbol, $.classname), $._dot)),
    _dot: $ => /\.{1}/,
    _symbol_chars: $ => /[a-zA-Z\*\+\!\-_\?][\w\*\+\!\-\?\':]*/,
    _class_chars: $ => /[a-zA-Z_]\w*/
  }
})
I would expect
foo
java.lang.String
.toUpperCase
java.awt.Point.
to be parsed to
(program
  (symbol)
  (classname)
  (member_access)
  (new_class (classname)))
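To make the expectation concrete, here is a rough sketch in plain JavaScript (not Tree-sitter) of how I think about the four shapes. The names `classify` and `classifyAll` are hypothetical helpers I made up for illustration; the character classes are copied from `_symbol_chars` and `_class_chars` in the grammar above. It only shows the classification I am after and says nothing about how Tree-sitter resolves the ambiguity.

```javascript
// Character classes copied from the grammar's _symbol_chars and _class_chars.
const SYMBOL = "[a-zA-Z*+!\\-_?][\\w*+!\\-?':]*";
const CLASS_CHARS = "[a-zA-Z_]\\w*";

// Order matters: the most specific shapes are tried first.
const patterns = [
  // symbol or dotted class name followed by a trailing dot, e.g. java.awt.Point.
  ["new_class", new RegExp(`^${SYMBOL}(\\.${SYMBOL})*\\.$`)],
  // at least two dot-separated parts, e.g. java.lang.String
  ["classname", new RegExp(`^${SYMBOL}(\\.${SYMBOL})+$`)],
  // leading dot, e.g. .toUpperCase
  ["member_access", new RegExp(`^\\.${CLASS_CHARS}$`)],
  // anything else that is a plain symbol, e.g. foo
  ["symbol", new RegExp(`^${SYMBOL}$`)],
];

// Hypothetical helper: classify a single token by its shape.
function classify(token) {
  const hit = patterns.find(([, re]) => re.test(token));
  return hit ? hit[0] : "unknown";
}

// Hypothetical helper: split on whitespace and commas (same as `extras`).
function classifyAll(source) {
  return source.split(/[\s,]+/).filter(Boolean).map(classify);
}

const input = "foo\njava.lang.String\n.toUpperCase\njava.awt.Point.";
// Yields, in order: symbol, classname, member_access, new_class
console.log(classifyAll(input));
```

The point is that the whole dotted run should be consumed before deciding between classname and new_class, which is the behaviour I would like the grammar to have.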
But Tree-sitter keeps producing (new_class (classname)) (classname) instead of (classname) for java.lang.String. I suppose that I need some kind of greedy matching, and I have tried prec.right() in different places, to no avail. What am I missing?