
I'm trying to write a tree-sitter grammar for Minecraft function (mcfunction) files.

The structure of the language looks like this:

command @e[key=value] other args

I'm having an issue with the value in the second argument (the target selector) in the above example. This value can take many forms: strings, numbers, booleans, and two similar brace-delimited structures (NBT and a scoreboard object).

Here are examples of each:

NBT

{key:value}

Scoreboard Object

{key=number} // where number is: N, ..N, N.., or N..N
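To make the four range forms concrete, here is a small standalone sketch in plain JavaScript. It is separate from the grammar and only illustrates the shapes listed above; `parseRange`, `NUM`, and `RANGE` are hypothetical names, and the number pattern is assumed to be the same one the grammar's number rule uses.

```javascript
// Illustrative only: `parseRange` is a hypothetical helper, not part of the grammar.
// NUM mirrors the grammar's number regex: optional sign, digits, optional fraction.
const NUM = "-?\\d+(?:\\.\\d+)?";
const RANGE = new RegExp(`^(${NUM})?(\\.\\.)?(${NUM})?$`);

// Returns { min, max } (null means "unbounded"), or null if the text is not
// one of the forms N, ..N, N.., or N..N.
function parseRange(text) {
  const m = RANGE.exec(text);
  if (!m) return null;
  const [, lo, dots, hi] = m;
  if (!dots) {
    // Exact value "N": exactly one number and no "..".
    if (lo === undefined || hi !== undefined) return null;
    return { min: Number(lo), max: Number(lo) };
  }
  // A bare ".." with no number on either side is not a valid range.
  if (lo === undefined && hi === undefined) return null;
  return {
    min: lo === undefined ? null : Number(lo),
    max: hi === undefined ? null : Number(hi),
  };
}

console.log(parseRange("1.."));  // { min: 1, max: null }
console.log(parseRange("..5"));  // { min: null, max: 5 }
console.log(parseRange("1..5")); // { min: 1, max: 5 }
console.log(parseRange("3"));    // { min: 3, max: 3 }
```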

My grammar file contains the following code:

// unrelated code removed

module.exports = grammar({
  name: "mcfunction",
  rules: {
    root: $ => repeat(
      choice(
        $.command
      )
    ),
    command: $ => prec.right(seq(
      field("command_name", $.identifier),
      repeat(
        choice(
          $.selector
        )
      ),
      "\n"
    )),
    identifier: $ => /[A-Za-z][\w-]+/,
    number: $ => prec(1, /-?\d+(\.\d+)?/),
    boolean: $ => choice(
      "true",
      "false"
    ),
    string: $ => seq(
      "\"",
      repeat(
        choice(
          $._escape_sequence,
          /[^"]/
        )
      ),
      "\""
    ),
    _escape_sequence: $ => seq("\\", "\""),
    selector: $ => seq(
      token(
        seq(
          "@",
          choice(
            "p", "a", "e", "s", "r"
          )
        )
      ),
      optional(
        seq(
          token.immediate("["),
          optional(
            repeat(
              seq(
                $.selector_option,
                optional(",")
              )
            )
          ),
          "]"
        )
      ),
    ),
    selector_option: $ => seq(
      $.selector_key,
      "=",
      $.selector_value
    ),
    selector_key: $ => /[a-z_-]+/,
    selector_value: $ => choice(
      $.item,
      $.path,
      $.selector_key,
      $.selector_number,
      $.number,
      $.boolean,
      $.selector_object
    ),
    selector_number: $ => prec.right(1, choice(
      seq(
        "..",
        $.number
      ),
      seq(
        $.number,
        "..",
        $.number
      ),
      seq(
        $.number,
        ".."
      ),
      $.number
    )),
    selector_object: $ => choice(
      seq(
        "{",
        repeat(
          seq(
            $.selector_score,
            optional(",")
          )
        ),
        "}"
      ),
      seq(
        "{",
        repeat(
          seq(
            $.selector_nbt,
            optional(",")
          )
        ),
        "}"
      )
    ),
    selector_nbt: $ => seq(
      $.nbt_object_key,
      ":",
      $.nbt_object_value
    ),
    selector_score: $ => seq(
      field("selector_score_key", $.selector_key),
      "=",
      field("selector_score_value", $.selector_number)
    ),
    _namespace: $ => /[a-z_-]+:/,
    item: $ => seq(
      $._namespace,
      $.selector_key
    ),
    path: $ => seq(
      choice($.item, /[a-z_]+/),
      repeat1(
        token(seq("/", /[a-z_]+/))
      )
    ),
    nbt: $ => choice(
      $.nbt_array,
      $.nbt_object
    ),
    nbt_object: $ => seq(
      "{",
      repeat(
        seq(
          $.nbt_object_key,
          ":",
          $.nbt_object_value,
          optional(",")
        )
      ),
      "}"
    ),
    nbt_array: $ => seq(
      "[",
      repeat(
        seq(
          $.nbt_object_value,
          optional(",")
        )
      ),
      "]"
    ),
    nbt_object_key: $ => choice(
      $.string,
      $.number,
      $.identifier
    ),
    nbt_object_value: $ => choice(
      $.string,
      $.nbt_number,
      $.boolean,
      $.nbt
    ),
    nbt_number: $ => seq(
      $.number,
      field("nbt_number_suffix", optional(choice("l","s","d","f","b")))
    )
  }
});

However, if I compile and parse test @e[scores={example=1..}], I get:

(root [0, 0] - [6, 0]
  (command [0, 0] - [1, 0]
    command_name: (identifier [0, 0] - [0, 4])
    (selector [0, 5] - [0, 29]
      (selector_option [0, 8] - [0, 28]
        (selector_key [0, 8] - [0, 14])
        (selector_value [0, 15] - [0, 28]
          (selector_object [0, 15] - [0, 28]
            (ERROR [0, 16] - [0, 27]
              (nbt_object_key [0, 16] - [0, 23]
                (identifier [0, 16] - [0, 23]))))))))
tests/test.mcfunction  0 ms    (ERROR [0, 16] - [0, 27])

Expected: instead of ERROR, it should be selector_score, and there should be a score_key and score_value.

This does not happen if I remove the selector_nbt sequence from selector_object. Conversely, parsing a command that uses NBT data produces no errors, whether selector_object contains both sequences or only the selector_nbt one.

What am I doing wrong?

theusaf

1 Answer


I solved this by using a choice of the two conflicting keys, something like this:

choice(
  alias($.key_1, $.key_2),
  $.key_2
)

ahlinc on GitHub answered:

You can fix your error for the above grammar by assigning lexer precedence for the selector_key terminal over the identifier terminal like:

selector_key: $ => token(prec(1, /[a-z_-]+/)),

But note that the regexps you use clash with each other:

identifier: $ => /[A-Za-z][\w-]+/,
selector_key: $ => token(prec(1, /[a-z_-]+/)),
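The clash can be seen outside tree-sitter with a short plain-JavaScript sketch. The two patterns are copied from the grammar; everything else (`tokens`, `pickToken`, the `order` and `prec` fields) is hypothetical and only a simplified model of how the lexer chooses between tokens that are both valid in the current state: higher precedence first, then the longer match, then the rule defined earlier in the grammar. It is not tree-sitter's actual implementation.

```javascript
// Patterns copied from the grammar; `order` is the position of each rule in
// the rules object. The sticky flag (y) anchors each match at lastIndex.
const tokens = [
  { name: "identifier",   pattern: /[A-Za-z][\w-]+/y, order: 0, prec: 0 },
  { name: "selector_key", pattern: /[a-z_-]+/y,       order: 1, prec: 0 },
];

// Simplified model (an assumption, not tree-sitter's real lexer): prefer
// higher precedence, then the longer match, then the earlier rule.
function pickToken(candidates, input) {
  let best = null;
  for (const t of candidates) {
    t.pattern.lastIndex = 0;
    const m = t.pattern.exec(input);
    if (!m) continue;
    const cand = { name: t.name, text: m[0], prec: t.prec, order: t.order };
    const better =
      best === null ||
      cand.prec > best.prec ||
      (cand.prec === best.prec && cand.text.length > best.text.length) ||
      (cand.prec === best.prec &&
        cand.text.length === best.text.length &&
        cand.order < best.order);
    if (better) best = cand;
  }
  return best;
}

// Both patterns match "example" with the same length, so the earlier rule wins.
console.log(pickToken(tokens, "example=1..}").name); // identifier

// Raising the precedence of selector_key, as token(prec(1, ...)) does, flips the choice.
const boosted = tokens.map(t => (t.name === "selector_key" ? { ...t, prec: 1 } : t));
console.log(pickToken(boosted, "example=1..}").name); // selector_key
```

In this model the first result lines up with the parse output in the question, where example was lexed as an identifier inside nbt_object_key and the following = then produced the ERROR node.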

If it's impossible to rewrite the above regexps so that they don't conflict, then you may need the workaround described here: #1287 (reply in thread)

theusaf