I am trying to understand numeric strings in PHP. I have the following code:

var_dump(5 * "10 abc");
var_dump(is_numeric("10 abc"));

Which gives me the output:

int(50)
bool(false)

This confuses me as the string "10 abc" seems to be interpreted as a numeric string in the first expression (hence the int(50) output and no warnings about using a non-numeric value), but when run through the is_numeric() function it returns false, suggesting that it is in fact not a numeric string.

I have spent some time looking through the documentation to understand this behaviour but can't find any concrete answers. Can somebody please explain what is causing it?

I am aware PHP 8.0.0 made some changes to what is considered a numeric string, but this is PHP 7.1.33 I am trying to understand right now.

Bradley
  • In the first expression "10 abc" is __cast__ to int, giving you `10`. This is unrelated to the `is_numeric` check. – u_mulder Mar 06 '21 at 17:50
  • I suggest NOT using `is_numeric()` and `empty()`. IMO both are very "spongy". For example, `empty('0')` is true, which is IMO wrong, because I have a string with a character in it. As for `is_numeric`, I would suggest casting to int if you expect an int (`(int)$val`). You could then also check whether the original value matches the int-cast value. – cottton Mar 06 '21 at 19:13
  • BTW: there are better functions - the `ctype_*` functions. In your case `ctype_digit`. – cottton Mar 06 '21 at 19:18
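
The two checks suggested in the comments above can be compared side by side. This is only a minimal sketch: `is_canonical_int_string` is a hypothetical helper name, not a PHP built-in, and it assumes the ctype extension is available (it usually is by default).

```php
<?php
// Hypothetical helper: true only when the string is exactly what casting
// it to int and back to string would produce (e.g. "10" or "-3").
function is_canonical_int_string(string $val): bool
{
    return (string) (int) $val === $val;
}

// Print both checks for a few sample inputs.
$samples = ['10', '-3', '10 abc', '010', '1.5', ''];
foreach ($samples as $s) {
    printf(
        "%-10s ctype_digit=%-5s round_trip=%s\n",
        var_export($s, true),
        var_export(ctype_digit($s), true),
        var_export(is_canonical_int_string($s), true)
    );
}
```

Note that the two checks are not equivalent: `ctype_digit` accepts only non-empty strings of decimal digits, so it rejects a leading sign, while the round-trip check accepts "-3" but rejects "010".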

2 Answers

RFC author of the "Saner numeric string" RFC which got accepted for PHP 8.0 here.

"10 abc" is not a numeric string, but a leading-numeric string, meaning that the beginning of the string looks like a number, but the string as a whole isn't one because gibberish appears at some point in it (and prior to PHP 8.0 this includes trailing whitespace).

Because is_numeric() checks that a value is considered numeric per PHP's definition (which prior to PHP 8.0 meant optional leading whitespace, followed by an optional + or - sign, followed by an integer, a normal decimal number, or a number in exponential notation), it will return false on strings which are only leading-numeric.

However, arithmetic operations try to convert their operands to a proper number type (int or float), and as such "10 abc" gets converted to 10 because PHP converts a leading-numeric string to its leading numeric value.
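
To make the three categories concrete, here is a rough sketch that sorts strings into numeric, leading-numeric, and non-numeric. The function name `classify_numeric_string` and its regular expression are illustrative assumptions: the pattern only approximates the prefix rule described above and is not PHP's actual parser.

```php
<?php
// Hypothetical helper, not part of PHP: classifies a string roughly the
// way the answer describes.
function classify_numeric_string(string $s): string
{
    // Fully numeric according to PHP's own definition.
    if (is_numeric($s)) {
        return 'numeric';
    }

    // Approximation of "starts with a number": optional leading whitespace,
    // optional sign, then an integer, decimal, or exponential literal.
    $leading = '/^[ \t\n\r\x0b\f]*[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?/';

    return preg_match($leading, $s) === 1 ? 'leading-numeric' : 'non-numeric';
}

foreach (['10', ' 10', '1e3', '10 abc', 'abc'] as $s) {
    printf("%-10s => %s\n", var_export($s, true), classify_numeric_string($s));
}
```

Strings with trailing whitespace such as "10 " are deliberately left out of the samples, because their classification differs between PHP 7 and PHP 8.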

Many more "fun" details and edge cases can be found in the technical background section of the PHP RFC.

Girgias

I think the easiest way to understand the behaviour you describe is that just because a string isn't numeric, that does not mean it cannot be coerced or treated as a number.

Your first line of code

var_dump(5 * "10 abc");

Treats the string as a number, and once it comes across an invalid character it just ignores everything else after that.

Your other line of code

var_dump(is_numeric("10 abc"));

Actually behaves more intelligently: it asks itself, just like a human might, whether we are dealing with a numeric string here. The answer is no (because of those same invalid characters).
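
If the lenient coercion is not what you want, one option is to validate with is_numeric() before doing any arithmetic. The sketch below assumes a hypothetical helper named `multiply_checked`; it is one possible guard, not a standard PHP function.

```php
<?php
// Hypothetical helper: refuses to do arithmetic on anything that
// is_numeric() rejects, instead of silently using the leading digits.
function multiply_checked(int $factor, string $value)
{
    if (!is_numeric($value)) {
        throw new InvalidArgumentException("Not a numeric string: '$value'");
    }

    return $factor * $value; // $value is a fully numeric string here
}

var_dump((int) '10 abc');            // int(10): the cast keeps the leading digits
var_dump(is_numeric('10 abc'));      // bool(false): the whole string is checked
var_dump(multiply_checked(5, '10')); // int(50)

try {
    multiply_checked(5, '10 abc');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```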

Raxi