I'm implementing a syntax highlighter in Apple's Swift language by parsing .tmlanguage files and applying styles to an NSMutableAttributedString.

I'm testing with JavaScript code, a javascript.tmlanguage file, and the monokai.tmtheme theme (both included in Sublime Text 3) to check that the syntax gets highlighted correctly. By applying each rule (pattern) in the .tmlanguage file in the order it appears, the syntax is almost perfectly highlighted.

The problem I'm having right now is that I don't know how to tell that a quote (") is escaped when it has a backslash before it (\"). Am I missing something in the .tmlanguage file that specifies that? Another problem is that I don't know how to tell when rules should be ignored inside other rules. For example:

Double slashes are taken as comments when inside strings: in "http://stackoverflow.com/", everything after the // is recognised as a comment.

Also, double or single quotes are taken as strings when inside comments: in // press "Enter" to continue, the word "Enter" gets highlighted as a string when it should be the same color as the comment.

So, I don't know if there is some priority for some rules over others in the convention, or if there is something in the files that I haven't noticed.

Help please!

Update:

Here is a better example of what I meant by escape quotes:

I'm getting this: [screenshot] while all the letters should be yellow except for the escape sequence (/"), which should be blue.

The question is: how do I know that /" should be escaped? The rule for that piece of code is:

[screenshot of the rule]

OscarVGG
  • Shouldn't the escape character be \ instead of /? That may explain why your coloring is off... – MattDMo Aug 25 '14 at 21:56
  • Check out the difference between [this screenshot](http://i.stack.imgur.com/Bow0p.png) and [this one](http://i.stack.imgur.com/JxgVx.png). The first one shows forward and backward slashes attempting to escape a double-quote in the middle of a string, highlighted using the built-in JavaScript syntax. The second one shows the same text, but highlighted with [JavaScriptNext](https://sublime.wbond.net/packages/JavaScriptNext%20-%20ES6%20Syntax), which IMO is a much better syntax. Regular JS doesn't show an escape char at all, while JSN does. Interesting... – MattDMo Aug 25 '14 at 22:05
  • That file was extracted from Sublime Text 3, and it works in Sublime, so it should work in my app. BTW, I forgot to mention that when the app tries to match the rule `\\(x\h{2}|[0-2][0-7]{,2}|3[0-6][0-7]|37[0-7]?|[4-7][0-7]?|.)`, I get this error: `"Error Domain=NSCocoaErrorDomain Code=2048 \"The operation couldn’t be completed. (Cocoa error 2048.)\" UserInfo=0x7fc862ea7220 {NSInvalidValue=\\\\(x\\h{2}|[0-2][0-7]{,2}|3[0-6][0-7]?|37[0-7]?|[4-7][0-7]?|.)}"` as I stated in this question: http://stackoverflow.com/questions/25439948/swift-regular-expressions-and-backslashes – OscarVGG Aug 25 '14 at 22:08
  • So should I ignore anything matched by the regexes in the "patterns" field for everything between quotes? What regex should I use for that? `"\\(x\h{2}|[0-2][0-7]{,2}|3[0-6][0-7]|37[0-7]?|[4-7][0-7]?|.)"` doesn't seem to work... – OscarVGG Aug 25 '14 at 22:12

3 Answers

Maybe I am late to answer this. You can try one of the following approaches.

  1. (Ugly) In your end regex, use ([^/])(") and set your endCaptures to

    1 = string.quoted.double.js
    2 = punctuation.definition.string.end.js

  2. If the string must be on a single line, you can use match=(")(.*)(") with captures=

    1 = punctuation.definition.string.begin.js
    2 = string.quoted.double.js
    3 = punctuation.definition.string.end.js
    and use your patterns

  3. You can try applyEndPatternLast and see if it is allowed. Setting applyEndPatternLast=1 will do.

attempt0

The priority is that earlier rules in the file are prioritized over later rules. As an example, in my Python Improved language definition, I have a scope that contains a series of all-caps constants used in Django, a popular Python web framework. I also have a generic constant.other.allcaps.python scope that recognizes (just about) anything in all caps. Since the Django constants rule is before the allcaps rule in the .tmLanguage file, I can color it with a theme using one color, while the later-occurring "highlight everything in all caps" only grabs identifiers that are NOT part of the first list.

Because of this, you should put your "comments" scope(s) as early in the file as possible, then write your parser in such a way that it obeys the rule I described above. However, it's slightly more complicated than that, as I believe items in the repository are prioritized based on where their include line is, not where the repository rule is defined in the file. You may want to do some testing to verify that, though.
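One way to picture this in the Swift parser is a scanner that tries every rule at the current position, lets the match that starts earliest win, uses rule order only to break ties, and then resumes after the winning match. The sketch below is a simplification under stated assumptions: `Rule`, `Token`, and `tokenize` are hypothetical names, the two regexes are hand-written stand-ins rather than the real entries from javascript.tmlanguage, and only single-line "match" rules are handled.

```swift
import Foundation

// Hypothetical, simplified stand-in for a parsed "match" rule.
struct Rule {
    let scope: String
    let regex: NSRegularExpression
}

struct Token: Equatable {
    let scope: String
    let text: String
}

// Scans one line. At each position every rule is tried; the match that starts
// earliest wins, and rule order only breaks ties. Scanning resumes after the
// winning match, so text already claimed by a string or a comment is never
// offered to the other rules.
func tokenize(_ line: String, rules: [Rule]) -> [Token] {
    let ns = line as NSString
    var tokens: [Token] = []
    var position = 0
    while position < ns.length {
        let searchRange = NSRange(location: position, length: ns.length - position)
        var best: (rule: Rule, range: NSRange)? = nil
        for rule in rules {
            guard let match = rule.regex.firstMatch(in: line, options: [], range: searchRange) else {
                continue
            }
            // Strict "<" keeps the earlier rule when two matches start at the same offset.
            if best == nil || match.range.location < best!.range.location {
                best = (rule: rule, range: match.range)
            }
        }
        // Stop when nothing matches (or on an empty match, to avoid looping forever).
        guard let winner = best, winner.range.length > 0 else { break }
        tokens.append(Token(scope: winner.rule.scope, text: ns.substring(with: winner.range)))
        position = winner.range.location + winner.range.length
    }
    return tokens
}

// Illustrative rules only (assumptions, not copied from a grammar file).
let rules = [
    Rule(scope: "comment.line.double-slash.js",
         regex: try! NSRegularExpression(pattern: "//.*", options: [])),
    Rule(scope: "string.quoted.double.js",
         regex: try! NSRegularExpression(pattern: "\"(?:\\\\.|[^\"\\\\])*\"", options: [])),
]

let line = "var u = \"http://stackoverflow.com/\" // press \"Enter\" to continue"
let tokens = tokenize(line, rules: rules)
for token in tokens {
    print(token.scope, token.text)
}
```

With this approach the // inside the URL is never seen by the comment rule, because the string starts earlier and swallows it, and the quotes around "Enter" are never seen by the string rule, because the comment has already consumed the rest of the line.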

Unfortunately I'm not sure what you mean about the escaped quotes - could you expand on that, and maybe add an example or two?

Hope this helps.

MattDMo
  • Thanks for your answer @MattDMo, it was very helpful. I updated my question so you could better understand what I meant about the escaped quotes – OscarVGG Aug 25 '14 at 21:06
  • Re: "items in the repository are prioritized based on where their include line is, not where the repository rule is defined in the file" Yes, this is correct – David J. Feb 25 '21 at 16:30

Assuming that / is the correct character for escaping a double quote mark, the following should work:

    "str_double_quote": {
        "begin": "\"",
        "end": "\"",
        "name": "string.quoted.double.swift",
        "patterns": [
            {
                "name": "constant.character.escape.swift",
                "match": "/[\"/]"
            }
        ]
    }

You can match an escaped double quote mark (/") and a literal forward slash (//) in the patterns, so that they are consumed before the end pattern gets a chance to match the quote.

If the character for escaping is actually a backslash, then the tricky bit is that there are two levels of escaping, for the JSON encoding as well as the regular expression syntax. To match \", the regular expression requires you to escape the backslash (\\"). JSON requires you to escape backslashes and double quotes, resulting in \\\\\" in a TextMate JSON grammar file. The match expression would thus be \\\\[\"\\\\].
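On the Swift side, the same consume-before-end idea could be sketched roughly as follows, assuming the backslash is the escape character. `scanStringBody` is a hypothetical helper, not part of any library, and the Swift string literals need the same kind of double escaping as the JSON above: the regex source `\\["\\]` becomes `"\\\\[\"\\\\]"`.

```swift
import Foundation

// Regex source: \\["\\]  (a backslash followed by a quote or another backslash).
let escapePattern = try! NSRegularExpression(pattern: "\\\\[\"\\\\]", options: [])
// Regex source: "  (the "end" pattern of the string rule).
let endPattern = try! NSRegularExpression(pattern: "\"", options: [])

// Hypothetical helper. `start` is the UTF-16 offset just after the opening quote.
// Returns the ranges of the escape sequences found inside the string and the
// range of the closing quote (nil if the string is not terminated).
func scanStringBody(_ text: String, from start: Int) -> (escapes: [NSRange], end: NSRange?) {
    let length = (text as NSString).length
    var escapes: [NSRange] = []
    var position = start
    while position < length {
        let range = NSRange(location: position, length: length - position)
        let escapeMatch = escapePattern.firstMatch(in: text, options: [], range: range)
        guard let endMatch = endPattern.firstMatch(in: text, options: [], range: range) else {
            return (escapes, nil) // no closing quote ahead
        }
        if let escape = escapeMatch, escape.range.location < endMatch.range.location {
            // The escape starts before the next quote, so it is consumed first
            // and the quote it contains can no longer end the string.
            escapes.append(escape.range)
            position = escape.range.location + escape.range.length
        } else {
            return (escapes, endMatch.range)
        }
    }
    return (escapes, nil)
}

// The JavaScript source text here is: "say \"hi\" now" + 1
let source = "\"say \\\"hi\\\" now\" + 1"
let result = scanStringBody(source, from: 1)
print(result.escapes, result.end as Any)
```

The escape ranges would get the constant.character.escape scope and everything from the opening quote through the returned end range would get the string scope.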

CodeManX