I am trying to match numbers in Qt for syntax highlighting, using the following regexes:

"[^a-fA-F_][0-9]+" // For numbers.
"[^a-fA-F_][0-9]+\\.[0-9]+" // For decimal numbers.
"[^a-fA-F_][0-9]+\\.[0-9]+e[0-9a-fA-F]+" // For scientific notation.
"[^a-fA-F_]0[xX][0-9a-fA-F]+" // For hexadecimal numbers.

But the regexes match, for example, [1024, and the [ gets highlighted too. I want to highlight only the 1024 part.

Another problem is that the regex highlights when I type aoe2 and when I type aoe25. I don't want to highlight the number when it is preceded by letters or underscores, because then it would be an identifier.

How can I solve that?

RenatoUtsch
  • Are there not functions for parsing numbers from strings? I know of at least `strto{ul,d}()` but Qt may have more of that? – fge Jan 05 '13 at 19:25
  • Can you provide some sample text and the matches you want? For example, which of these strings you want to match? "aoe2", "aoe25", "1024", "[1024" – vault Jan 05 '13 at 19:27
  • I want to get the position of the numbers on the string, not the numbers themselves, because then the Qt engine will highlight the characters in those positions. – RenatoUtsch Jan 05 '13 at 19:29
  • I don't know whether it is supported in C++ Regex or not, but you need negative look-ahead here. – Rohit Jain Jan 05 '13 at 19:30
  • I want to match only C numbers, for example: 1024, 4856, 284.582, 0x48e14, 42.42e42, etc. But the regex should not match when those numbers are in identifiers, for example "aoe45", and should match (only the number) when they are preceded by operators, for example "[1024]" or "+24". – RenatoUtsch Jan 05 '13 at 19:31
  • Rohit Jain, yeah, I was looking for negative lookbehind, but I don't know if Qt supports that. I couldn't make that work, maybe I made it wrong. – RenatoUtsch Jan 05 '13 at 19:33
  • Note that `09` is not a valid integer in C - having `0` at the beginning means it's octal and `8` or `9` is not octal digit. – zch Jan 05 '13 at 20:44
  • Argh, I forgot that. Thanks, I fixed it. – RenatoUtsch Jan 05 '13 at 20:54

1 Answer

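It matches [ because of this character class:

[^a-fA-F_]
This matches any single character that is not one of the letters a-f (in either case) or an underscore. A [ satisfies that, so it is consumed as part of the match and highlighted along with the digits.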
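
To see this concretely, here is a minimal sketch that runs the integer pattern from the question. It uses C++'s `std::regex` as a stand-in for Qt's regex class (an assumption, since the question does not say which Qt class is in use), and the helper name `firstMatch` is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of the
// question's integer pattern, or an empty string if nothing matches.
// std::regex is used here only as a stand-in for Qt's regex class.
std::string firstMatch(const std::string& text) {
    // The negated class consumes exactly one character before the digits.
    static const std::regex pattern("[^a-fA-F_][0-9]+");
    std::smatch m;
    return std::regex_search(text, m, pattern) ? m.str() : std::string();
}
```

With this, `firstMatch("[1024")` gives back `[1024`, bracket included. `firstMatch("aoe25")` gives `25`, because the digit `2` itself satisfies the negated class, which is why numbers inside identifiers still get picked up.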

Why not just match the digits, if that is what you want?

For integers:         \b\d+
For decimal numbers:  \b\d+\.\d+
For scientific:       \b\d+(\.\d+)?[eE][+-]?\d+
For hexadecimal:      \b0[xX][\dA-Fa-f]+

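Also, as @Jan Dvorak mentions, you can use the word boundary \b to make sure your matches start at the beginning of a word (or number). Letters, digits, and underscores all count as word characters, so there is no boundary between the e and the 2 in aoe25, and numbers inside identifiers are skipped. See the example here: http://regex101.com/r/kC6dK3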
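
Since you want positions rather than the numbers themselves, here is a rough sketch of how the word-boundary idea could be used to collect (position, length) pairs. It again uses `std::regex` in place of Qt's regex class and assumes the Qt class you use supports `\b`. The function name `findNumberSpans` and the single combined pattern are my own illustration, not something from Qt.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (position, length) for every C-style
// number in `text`. std::regex stands in for Qt's regex class here.
std::vector<std::pair<std::size_t, std::size_t>>
findNumberSpans(const std::string& text) {
    // Hex is tried first so "0x48e14" is not cut short; the second
    // alternative covers integers, decimals, and scientific notation.
    static const std::regex number(
        R"(\b(?:0[xX][0-9a-fA-F]+|\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)\b)");

    std::vector<std::pair<std::size_t, std::size_t>> spans;
    for (std::sregex_iterator it(text.begin(), text.end(), number), end;
         it != end; ++it) {
        spans.emplace_back(static_cast<std::size_t>(it->position()),
                           static_cast<std::size_t>(it->length()));
    }
    return spans;
}
```

For the input `a[1024] + aoe25 - 0x48e14 * 42.42e42` this yields three spans, covering `1024`, `0x48e14`, and `42.42e42`, while `aoe25` is left alone. In a `QSyntaxHighlighter` subclass, each pair could then be passed to `setFormat(start, count, format)`.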

Hunter McMillen