
I have a string in CMake which I got somehow, in a variable MYVAR. I want to check whether that string is an integral number - possibly with whitespace. I have an ugly way to do it:

string(REGEX MATCH "^[ \t\r\n]*[0-9]+[ \t\r\n]*$" MYVAR_PARSED "${MYVAR}")
if ("${MYVAR_PARSED}" STREQUAL "")
    message(FATAL_ERROR "Oh no!" )
endif()
# Now I can work with MYVAR as a number

Is there a better way? Or should I just wrap this in a function?

Note: I'm using the CMake regex syntax as documented here.

einpoklum
  • "I have an ugly way to do it" - You call a regex containing only simple `*`, `+` and `?` "ugly"? What is a *non-ugly* regex in that case? A regex which matches an exact string? Or do you want to check a string **without** using a regex? Well, you could iterate over the string's characters and apply a simple state machine, but I don't think that way is simpler than the regex. Not sure what you want from us. – Tsyvarev Jul 01 '21 at 19:56
  • @Tsyvarev: Non-ugly would be: `if (IS_NUMBER "${MYVAR}")` or something like that... also, not having to use an auxiliary variable, even if I do use a regex. – einpoklum Jul 01 '21 at 20:04
  • "Non-ugly would be: `if (IS_NUMBER "${MYVAR}")`" - Do you really expect CMake to have a ready-made check for a type which is rarely used (decimal with exponent using `E`)? Before your question I hadn't even thought about such a representation. "also, not having to use auxiliary variable, even if I do use a regex." - What about `if(MYVAR MATCHES "")`? This is a perfectly valid check in CMake. – Tsyvarev Jul 01 '21 at 20:28
  • @Tsyvarev: You make a valid point. So, let's forget about the exponent. See edit. – einpoklum Jul 01 '21 at 21:07

1 Answer


Option 1: Exact match

"I want to check whether that string is an integral number - possibly with whitespace."

If this is the exact spec I need, then I would check it like so:

string(STRIP "${MYVAR}" MYVAR_PARSED)
if (NOT MYVAR_PARSED MATCHES "^[0-9]+$")
    message(FATAL_ERROR "Expected number, got '${MYVAR}'")
endif ()

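This first strips leading and trailing whitespace from the value of MYVAR, storing the result in MYVAR_PARSED. Then it checks that MYVAR_PARSED is a non-empty sequence of decimal digits and errors out if it is not. Note that a sign is not accepted, so "-5" and "+5" are both rejected.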

I think doing this ad-hoc is fine, but if you want a function:

function(ensure_int VAR VALUE)
  string(STRIP "${VALUE}" parsed)
  if (NOT parsed MATCHES "^[0-9]+$")
    message(FATAL_ERROR "Expected number, got '${VALUE}'")
  endif()
  set(${VAR} "${parsed}" PARENT_SCOPE)
endfunction()

ensure_int(MYVAR_PARSED "${MYVAR}")
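
If you would rather branch on the result than abort, the same check can be wrapped so that it reports a boolean. This is only a sketch: `is_integer` and `MYVAR_IS_INT` are names made up for illustration, not CMake built-ins, and the `set(MYVAR ...)` line is a stand-in for wherever your value really comes from. It still needs one result variable, because a CMake function cannot be called inside an if() condition.

```cmake
# Illustrative input; in practice MYVAR comes from elsewhere.
set(MYVAR " 42 ")

# Hypothetical helper (not a CMake built-in): sets <out_var> to TRUE if
# <value> is a non-empty run of decimal digits, optionally surrounded by
# whitespace, and to FALSE otherwise.
function(is_integer out_var value)
  string(STRIP "${value}" stripped)
  if (stripped MATCHES "^[0-9]+$")
    set(${out_var} TRUE PARENT_SCOPE)
  else ()
    set(${out_var} FALSE PARENT_SCOPE)
  endif ()
endfunction()

is_integer(MYVAR_IS_INT "${MYVAR}")
if (NOT MYVAR_IS_INT)
  message(FATAL_ERROR "Expected number, got '${MYVAR}'")
endif ()
```

Saved to a file, this can be tried in script mode with cmake -P.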

Option 2: Looser match

However, the following solution might in some cases be more robust, depending on your requirements:

math(EXPR MYVAR_PARSED "${MYVAR}")

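This will interpret the value of MYVAR as a simple mathematical expression whose result must fit in a 64-bit signed integer. It will interpret 0x-prefixed numbers as hexadecimal (CMake 3.13 and later). It recognizes most C arithmetic operators: +, -, *, /, %, |, &, ^, ~, <<, >> and parentheses.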

On the other hand, it might be too permissive: this solution will accept things like 0 + 0x3. It will also variously warn or error depending on how broken the expression is. However, this might not be an issue if you subsequently validate the range of the result. You could, for instance, check if (MYVAR_PARSED LESS_EQUAL 0) and error out if so.
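
Put together, the parse-then-validate idea might look like the sketch below. The input value is made up for illustration, and the "must be positive" rule is just the example range check from the paragraph above; substitute whatever bounds your use case needs.

```cmake
# Illustrative input; in practice MYVAR comes from elsewhere.
set(MYVAR " 0x10 ")

# Evaluate the value as an integer expression.
# Hexadecimal input requires CMake 3.13 or later.
math(EXPR MYVAR_PARSED "${MYVAR}")

# Reject zero and negative results.
# The LESS_EQUAL operator requires CMake 3.7 or later.
if (MYVAR_PARSED LESS_EQUAL 0)
  message(FATAL_ERROR "Expected a positive number, got '${MYVAR}'")
endif ()

message(STATUS "MYVAR_PARSED = ${MYVAR_PARSED}")
```

If MYVAR is not a valid expression at all, math() itself reports the error, so the range check only ever sees an integer.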

Documentation for the math command: https://cmake.org/cmake/help/latest/command/math.html

Alex Reinking
  • The 1st solution is what I'm using, only stricter w.r.t. whitespace, so doesn't really answer my question. The second solution - if I use it, I still need to check what happens to `EXPR`. I don't mind the over-permissiveness w.r.t. notation, but - this will accept non-integral numbers. And like I said, it's a multi-command solution. So, +1 for effort but -1 for poor solutions :-( – einpoklum Jul 02 '21 at 07:47
  • @einpoklum - I don't think the poor solutions are _my fault_. The CMake language isn't built to handle numbers outside of the `math` command and the `if` numeric comparisons (which are also very permissive w.r.t. numeric prefixes). I'll also add that `math()` doesn't handle floating point _at all_. – Alex Reinking Jul 02 '21 at 09:31
  • You are officially absolved of the solutions being poor :-) ... interesting about no floating-point in math; that makes it a little better. And with the function wrapping this - +1! – einpoklum Jul 02 '21 at 09:39