
I'm using AvalonEdit in an app that runs my own custom-built language. I have defined a highlighting.xml file that works just fine.

Now I am trying to extend it with the following rule:

the word that follows "method" should be colored blue.

I came up with this regex to do so:

(?s)(?<=method )(.+?)(?= )

And tested it with this input:

via method AMethod on interface

This works fine in http://regexstorm.net/tester.

Then I tried the following rules, but none of them worked. With any of them in place, nothing gets highlighted at all.

<Rule foreground="DarkBlue">
  \(?s)(?<=method )(.+?)(?= )
</Rule>

<Rule foreground="DarkBlue">
  \(?s)(?&lt;=method )(.+?)(?= )
</Rule>

<Rule foreground="DarkBlue">
  (?s)(?<=method )(.+?)(?= )
</Rule>

This one did not break the highlighting, but did not work either:

<Rule foreground="DarkBlue">
  (?s)(?&lt;=method )(.+?)(?= )
</Rule>

Is what I am trying to do possible? Is the regex correct? I am a complete beginner with regex.

Thanks in advance.

Update for Divisadero's answer

These break the highlighting:

<Rule foreground="DarkBlue">
  \(?s)(?<=method )([^' ']+)
</Rule>

<Rule foreground="DarkBlue">
  \(?s)(?&lt;=method )([^' ']+)
</Rule>

<Rule foreground="DarkBlue">
  (?s)(?<=method )([^' ']+)
</Rule>

This one doesn't break the highlighting, but doesn't work either:

<Rule foreground="DarkBlue">
  (?s)(?&lt;=method )([^' ']+)
</Rule>
JoanComasFdz
  • Are `via`, `method`, `on` and `interface` defined as keywords in your language? That is, are they all included in the `<Keywords>` rule? Do you have a rule for user-defined names in general, with a regex like `\b\w+\b`, or `\b[A-Z]\w*\b`? – Alan Moore Mar 09 '16 at 09:55
  • Only `method` and `interface` are keywords, colored in green. There is also this rule at the end of the file: \b0[xX][0-9a-fA-F]+ # hex number | \b ( \d+(\.[0-9]+)? #number with optional floating point | \.[0-9]+ #or just starting with floating point ) ([eE][+-]?[0-9]+)? # optional exponent – JoanComasFdz Mar 09 '16 at 10:00

2 Answers


If all you want is to highlight the name after method, use:

(?s)(?<=method )([a-zA-Z0-9]+)

The [a-zA-Z0-9]+ part should list whatever characters you allow in a name.

And if you really need everything except a space, use:

(?s)(?<=method )([^' ']+)

Note that the class [^' '] excludes apostrophes as well as the space; [^ ]+ excludes only the space.
Divisadero
  • Thanks for the answer. These regexes look much simpler and they still work in http://regexstorm.net/tester. Unfortunately they do not work in the highlights.xml. I updated the question. – JoanComasFdz Mar 09 '16 at 09:45

It doesn't surprise me that rules based on lookbehind don't work. A syntax highlighter is just a glorified lexer, which means it doesn't use regexes the way you might expect. Instead of searching for a match, it steps through the string manually, always acting as if (1) the current position is the beginning of the string, and (2) the regex has a start anchor (\A) on the front of it. So lookbehinds aren't illegal, but they don't work; positive lookbehinds like (?<=method ) always fail, and negative lookbehinds always succeed.
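
The difference between searching and lexer-style matching can be reproduced outside the editor. The sketch below uses Python's re module, not AvalonEdit's engine, and it models "the current position is the beginning of the string" by slicing the line at that position. That slicing is an assumption made for illustration, not a copy of what the highlighter does internally.

```python
import re

line = "via method AMethod on interface"
positive = re.compile(r"(?<=method )\w+")
negative = re.compile(r"(?<!method )\w+")

# Ordinary search: the lookbehind can see "method " to the left of the match.
found = positive.search(line)
searched = found.group(0) if found else None

# Lexer-style matching (assumed model): everything left of the current
# position is cut off, and match() only tries at the start of the string.
pos = line.index("AMethod")
rest = line[pos:]
anchored_positive = positive.match(rest)   # nothing to look behind at
anchored_negative = negative.match(rest)   # trivially satisfied

print(searched)
print(anchored_positive)
print(anchored_negative.group(0))
```

Under this model the positive lookbehind has no text to inspect, so it cannot succeed, while the negative lookbehind succeeds for the same reason.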

But you shouldn't need a lookbehind anyway. In lexing most languages, you can identify a user-defined name because it looks like a name and it hasn't already been consumed by another rule (string, comment, keyword...). In your example, via, method, on and interface all look like keywords, so they should be included in your <Keywords> rule. Then you can add another rule for user-defined names, like:

<!-- name -->
<Rule foreground="DarkBlue">
  \b\w+\b
</Rule>

(That regex is just a guess, but--fun fact--the \w shorthand was invented for exactly this purpose.) If you want to differentiate between method names and other names, you can add another rule, before that one, with a more specific regex:

<!-- method name -->
<Rule foreground="LightBlue">
  \b[A-Z]\w*\b
</Rule>
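
Why the order of the rules matters can be shown with a toy lexer. This is a minimal sketch in Python, assuming a "first matching rule wins" policy; the rule table, the labels and the classify function are hypothetical names for illustration, and AvalonEdit may resolve competing rules differently.

```python
import re

# Hypothetical rule table, tried in order at each position; first match wins.
RULES = [
    ("keyword", re.compile(r"\b(?:via|method|on|interface)\b")),
    ("method-name", re.compile(r"\b[A-Z]\w*\b")),
    ("name", re.compile(r"\b\w+\b")),
]

def classify(line):
    """Return (text, label) pairs for every token that some rule matches."""
    tokens, pos = [], 0
    while pos < len(line):
        for label, regex in RULES:
            m = regex.match(line, pos)
            if m:
                tokens.append((m.group(0), label))
                pos = m.end()
                break
        else:
            pos += 1  # no rule matched here (e.g. a space); skip one character
    return tokens

tokens = classify("via method AMethod on interface foo")
for text, label in tokens:
    print(text, label)
```

If the general name rule were listed before the more specific one, it would consume AMethod first and the specific rule would never get a chance.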

By the way, the (?s) modifier allows the dot (.) to match any character including newlines. It probably has no effect here, since the highlighter processes one line at a time, but it's definitely not doing any good.
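
The effect of (?s) is easy to check in isolation. The snippet below is a small sketch using Python's re module, where the inline flag has the same meaning; the sample strings are made up for the demonstration.

```python
import re

text = "method A\nB"

# Without (?s) the dot stops at the newline; with it, the dot crosses it.
without_s = re.search(r"A.B", text)
with_s = re.search(r"(?s)A.B", text)

# On a single line (no newline present) the flag changes nothing.
single = "via method AMethod on interface"
plain = re.findall(r"method (.+?) ", single)
flagged = re.findall(r"(?s)method (.+?) ", single)

print(without_s, with_s is not None, plain == flagged)
```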

Alan Moore
  • Thanks a lot for all that information. I added "via" and "on" as keywords with no color specified. Now every single word is highlighted, even if it's not between via and on. – JoanComasFdz Mar 09 '16 at 11:16
  • You are using different colors for keywords and other words, aren't you? What other words are there, that you don't want highlighted? Do they have different formats, e.g., capitalized vs. not capitalized? – Alan Moore Mar 09 '16 at 13:47
  • Well, I am using a GIVEN WHEN THEN structure, where at some point method and interface names must be declared. But it's not very strict, so people can write sentences. The only thing I know is that whenever this pattern is written, "[unit][space][%UNITNAME%][via method][%METHODNAME%][on interface][%INTERFACE NAME%]", I want the method name and interface name colored differently. "unit", "method" and "interface" are keywords. – JoanComasFdz Mar 09 '16 at 14:10
  • Maybe it would be best not to highlight them. If you color all the keywords, other words will stand out because they're *not* colored. – Alan Moore Mar 09 '16 at 17:35
  • Yeah, but I also implemented CONTROL+CLICK to search for that method / interface in the current solution. That's what I am trying to reflect. Since the default color for methods in the editor is black, I can accept leaving it, but I would really like to highlight the interface name. – JoanComasFdz Mar 09 '16 at 18:03