I want to capture logical operators from ooRexx with a regex in a .cson file, because I want to support syntax highlighting of ooRexx in the Atom editor. These are the operators I am trying to cover:

>= <= \> \< \= >< <> == \== // && || ** ¬> ¬< ¬= ¬== >> << >>= \<< ¬<< \>> ¬>> <<=

And this is the regex part in the cson file:

'match': '\\+ | - | [\\\\] | \\/ | % | \\* | \\| | & |=|¬|>|<|
>= | <= | ([\\\\]>) | ([\\\\]<) | ([\\\\]=) | >< | <> | == | ([\\\\]==) | 
\\/\\/ | && | \\|\\| | \\*\\* | ¬> | ¬< | ¬= | ¬== | >> | << | >>= | ([\\\\]<<) | ¬<< |
([\\\\]>>) | ¬>> | <<='

I'm struggling with the slashes (forward and backward) and also with the double `**`. My knowledge of regex is very basic, to put it nicely. Can somebody help me with that?

Felix Dombek
  • Try escaping each backslash just once, so that every `\` in the operator becomes `\\` in the regex. You should also never need `[ ]`. Also state what the actual error is that you encounter. – Felix Dombek Mar 08 '17 at 22:16

1 Answer


You have spaces around the pipe bars, and those spaces are part of the regular expression. So when you write something like | \*\* |, the double asterisks are matched only when they have a space on each side, not when they are attached to a word or sit at the beginning or end of a line. The slashes have the same issue: I tested your pattern and it does catch them, but only when the slashes (or asterisks) are between two spaces.
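To illustrate the point about spaces, here is a minimal sketch in Python. It assumes that Python's `re` module treats literal spaces the same way as the regex engine Atom uses for grammars, which should hold for patterns this simple; the variable names are made up for the example.

```python
import re

# Spaces around the bar are literal: the alternatives here are
# "** " (asterisks plus a trailing space) and " //" (a leading space plus slashes).
spaced = re.compile(r"\*\* | //")

# The same alternatives with the spaces removed.
tight = re.compile(r"\*\*|//")

attached = "a**b"      # operator attached to its operands
separated = "a ** b"   # operator surrounded by spaces

spaced_on_attached = spaced.search(attached)    # no match
tight_on_attached = tight.search(attached)      # matches the two asterisks
spaced_on_separated = spaced.search(separated)  # matches, including the space
```

Removing the spaces around each `|` is the whole fix for this part of the problem.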

A few other things to keep in mind:

  • You shouldn't need the square brackets around backslashes; square brackets define a class of possible characters to match. For instance, [<>]= will catch both >= and <=. Writing [\\] is equivalent to writing \\ directly, because the first backslash escapes the second and the pair counts as a single character. Similarly, your parentheses are not doing anything here; see grouping.
  • Also consider using repetition operators like + and *. For example, \\>+ will catch both \> and \>>.
  • Finally, the question mark will help you avoid repetition, by marking the previous character (or a group in parentheses, or a class in square brackets) as optional. ==? will match both = and ==.
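The three tricks can be tried in isolation. This is a small sketch in Python rather than in the grammar file itself, assuming the engines agree on these basic constructs; the names are only illustrative.

```python
import re

# Character class: one pattern for both ">=" and "<=".
class_re = re.compile(r"[<>]=")

# Repetition: a literal backslash followed by one or more ">".
repeat_re = re.compile(r"\\>+")

# Optional character: "=" followed by an optional second "=".
optional_re = re.compile(r"==?")

class_hits = [op for op in (">=", "<=") if class_re.fullmatch(op)]
repeat_hits = [op for op in ("\\>", "\\>>") if repeat_re.fullmatch(op)]
optional_hits = [op for op in ("=", "==") if optional_re.fullmatch(op)]
```

Note that `+` is looser than the operator list: `\\>+` would also accept a backslash followed by three or more `>`, which may or may not matter for highlighting.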

You can group together a LOT of your statements with these three tricks combined… I'll leave that exercise to you!
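For reference, one possible way to combine the tricks is sketched below. It is not the only grouping and has only been checked against the operator list from the question, using Python's `re` as a stand-in for Atom's engine. Because alternatives are tried from left to right, the longer operators are listed first. The final doubling of backslashes assumes that a single-quoted CSON string needs each regex backslash written twice, as in the question's own pattern.

```python
import re

# Multi-character operators listed in the question.
OPERATORS = [
    ">=", "<=", "\\>", "\\<", "\\=", "><", "<>", "==", "\\==",
    "//", "&&", "||", "**", "¬>", "¬<", "¬=", "¬==",
    ">>", "<<", ">>=", "\\<<", "¬<<", "\\>>", "¬>>", "<<=",
]

OPERATOR_PATTERN = (
    r">>=|<<="               # three-character operators first
    r"|[\\¬]?(?:>>|<<|==)"   # doubled comparisons, optionally negated
    r"|[<>]=|><|<>"          # two-character comparisons
    r"|[\\¬]?[<>=]"          # single comparisons, optionally negated
    r"|//|&&|\|\||\*\*"      # other doubled symbols
    r"|[-+\\/%*|&¬]"         # remaining single-character operators
)
OPERATOR_RE = re.compile(OPERATOR_PATTERN)


def matches_whole(op):
    """Return True if the pattern consumes the entire operator from its start."""
    m = OPERATOR_RE.match(op)
    return m is not None and m.group() == op


# Operators the pattern fails to capture in full.
unmatched = [op for op in OPERATORS if not matches_whole(op)]

# Assumed CSON form: double every backslash for the single-quoted string.
cson_value = OPERATOR_PATTERN.replace("\\", "\\\\")
print(unmatched)
print("'match': '" + cson_value + "'")
```

Testing each operator with `match` instead of `fullmatch` is deliberate: it shows which alternative wins first, which is what a highlighter scanning the text would see.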

Just another hint when developing long regular expressions — use a tester like Regex101 or similar with a test file to see your changes in real time, and debuggers like Regexper will help you understand how your regular expression is parsed.

Victor
  • Thank you so much. I wasn't aware that the spaces are part of the regex pattern. Now I have it working. – Atlantaner Mar 10 '17 at 12:49
  • @Atlantaner great! If that worked for you please [mark the question as resolved](http://stackoverflow.com/help/someone-answers). – Victor Mar 11 '17 at 13:15