8

When I type int("1.7"), Python raises an error (specifically, ValueError). I know that I can convert it to an integer with int(float("1.7")). I would like to know why the first method raises an error.

smci
  • 32,567
  • 20
  • 113
  • 146
user3140972
  • 995
  • 6
  • 11
  • 1
    Integer versus floating-point numbers behave differently on computers. It's rare to mix them up for the same purpose. So Python's behavior prevents you from making errors. – Nayuki Oct 13 '15 at 22:54
  • 1
    I think it's because int treats strings differently than floats ... it truncates floats ... but it checks strings for just digits (whitespace on the ends is ok) – Joran Beasley Oct 13 '15 at 22:54
  • Python tries to prevent subtle bugs, and this feature would encourage them. Imagine: you ask Bob for his age. Bob thinks "I'm turning 18 next month" and enters 17.9, which your code wasn't expecting. Do you want to just throw out that 0.9, or do you want to signal an error, so that either Bob fixes his input or you fix the code? Python's mottos include "Errors should never pass silently. Unless explicitly silenced. In the face of ambiguity, refuse the temptation to guess." Your feature goes against that guideline, so Python doesn't do it. For more words of wisdom, type 'import this'. – Mark VY Oct 14 '15 at 04:56

4 Answers

9

From the documentation:

If x is not a number or if base is given, then x must be a string or Unicode object representing an integer literal in radix base ...

Obviously, "1.7" does not represent an integer literal in radix base.

If you want to know why the Python devs decided to limit themselves to integer literals in radix base, there are possibly infinitely many reasons, and you'd have to ask Guido et al. to know for sure. One guess would be ease of implementation + efficiency. You might think it would be easy for them to implement it as:

  1. Interpret number as a float
  2. truncate to an integer

Unfortunately, that doesn't work in Python, because integers can have arbitrary precision and floats cannot. Special-casing big numbers could lead to inefficiency for the common case.¹
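To make the precision problem concrete, here is a minimal sketch, assuming the usual 64-bit IEEE 754 floats that CPython uses; the variable names are only for illustration. Going through float silently changes an integer that needs more than 53 bits, while int parses the string exactly:

```python
# 2**53 + 1 is the smallest positive integer that a 64-bit float cannot represent.
text = "9007199254740993"

exact = int(text)             # arbitrary-precision parse, no rounding
via_float = int(float(text))  # rounded to the nearest representable float first

print(exact)               # 9007199254740993
print(via_float)           # 9007199254740992
print(exact == via_float)  # False
```

So a float-then-truncate implementation of int would need extra logic for large inputs to stay exact.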

Additionally, forcing you to write int(float(...)) has the added benefit of clarity -- it makes it more obvious what the input string probably looks like, which can help in debugging elsewhere. In fact, I might argue that even if int accepted strings like "1.7", it'd be better to write int(float("1.7")) anyway for the increased code clarity.

¹ Assuming some validation. Other languages skip this -- e.g. Ruby will evaluate '1e6'.to_i and give you 1, since it stops parsing at the first non-integral character. Seems like that could lead to fun bugs to track down ...
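For comparison, here is a rough Python sketch of that stop-at-the-first-invalid-character style of parsing. The helper lenient_int is hypothetical (it is not part of Python), and it only approximates the behaviour described above rather than claiming to match Ruby exactly:

```python
import re

# Hypothetical helper: optional whitespace, optional sign, then leading digits.
_LEADING_INT = re.compile(r"\s*([+-]?\d+)")

def lenient_int(text):
    """Return the integer prefix of text, or 0 if there is none (assumed fallback)."""
    match = _LEADING_INT.match(text)
    return int(match.group(1)) if match else 0

print(lenient_int("1e6"))                # 1
print(lenient_int("1.7"))                # 1
print(lenient_int("1 hundred baboons"))  # 1
print(lenient_int("baboons"))            # 0
```

Every one of these inputs produces a number without complaint, which is exactly the kind of silent guess that the built-in int refuses to make.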

mgilson
  • 300,191
  • 65
  • 633
  • 696
  • 2
    I don't think it's anything to do with `int` being arbitrary precision. In Ruby, `"1.7".to_i` == `1`. They just stop parsing at the first invalid character. I think it's just one of Guido's calls to help prevent hard to detect bugs. – John La Rooy Oct 13 '15 at 23:51
  • 2
    Perhaps. But, it seems like it'd be a pretty surprising day for me if `int('1 hundred baboons')` resulted in `1` (which, is what the ruby equivalent would give you if I'm understanding your comment correctly). I suppose that parsing without any validation can be much quicker -- but is that really what you want? (and the answer to that question is also one reason that we have so many programming languages with different ways of thinking, etc.) – mgilson Oct 13 '15 at 23:55
  • It's not what _I_ want. I _wanted_ ruby to raise an exception when I was doing something like this the other day. The magic gets in the way sometimes. I am happy to be explicit in cases like these. – John La Rooy Oct 13 '15 at 23:58
  • @JohnLaRooy -- Right. From what I understand, `ruby` is a lot more fast and loose than python (wanna monkey patch the integer type? Sure, why not?...). Anyway, I guess my statement about inefficiency was assuming a level of validation that I consider to be sane (which is likely to be different than the levels of validation that Matsumoto considers to be sane ;-) – mgilson Oct 14 '15 at 00:00
  • 1
    Also, It sounds like in ruby, `'1e6'.to_i == 1` which is almost certainly not what the user meant. . . :-) – mgilson Oct 14 '15 at 00:07
  • I don't see that special-casing big floats would lead to inefficiency for the common case... you simply raise a FloatPrecisionError if you see too much precision in the mantissa; you don't need to actually parse it. As opposed to always raising a ValueError on a float like we currently do. (We could make the decimal case just as fast by simply checking the string length of the mantissa digits, that's trivially easy). Anyway I believe the reason is simply **backward-compatibility** for a long-established idiom (EAFP on casting to int or float), per my answer. – smci Oct 14 '15 at 00:53
2

We have a good, obvious idea of what "make an int out of this float" means because we think of a float as two parts and we can throw one of them away.

It's not so obvious when we have a string. "Make this string into a float" implies all kinds of subtle things about the contents of the string, and that is not the kind of thing a sane person wants to happen implicitly in code where the value is not obvious.

So the short answer is: Python likes obvious things and discourages magic.
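As an illustration of those subtle things, here is a small sketch; the helper name int_via_float is made up for this example. It shows what the explicit int(float(...)) route does with several strings that float accepts:

```python
def int_via_float(text):
    """Hypothetical helper: return int(float(text)), or the exception name on failure."""
    try:
        return int(float(text))
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__

print(int_via_float("1.7"))   # 1
print(int_via_float("-1.7"))  # -1 (truncation is toward zero, not downward)
print(int_via_float("1e3"))   # 1000
print(int_via_float("inf"))   # OverflowError
print(int_via_float("nan"))   # ValueError
```

Writing the float step out makes it visible that exponents, infinities, and NaN are all possible inputs.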

Chad Miller
  • 1,435
  • 8
  • 11
1

The Python documentation has a good description of why you cannot do this:

https://docs.python.org/2/library/functions.html#int

If x is not a number or if base is given, then x must be a string or Unicode object representing an integer literal in radix base. Optionally, the literal can be preceded by + or - (with no space in between) and surrounded by whitespace. A base-n literal consists of the digits 0 to n-1, with a to z (or A to Z) having values 10 to 35. The default base is 10. The allowed values are 0 and 2-36. Base-2, -8, and -16 literals can be optionally prefixed with 0b/0B, 0o/0O/0, or 0x/0X, as with integer literals in code. Base 0 means to interpret the string exactly as an integer literal, so that the actual base is 2, 8, 10, or 16.

Basically, to convert a string to an integer with int, the string must not contain a "."
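A few calls that exercise the quoted rules, as a minimal sketch using only the built-in int. The Python 2 form of the octal prefix (a bare leading 0) is left out because it is not accepted in Python 3:

```python
print(int("  42  "))   # 42: surrounding whitespace is allowed
print(int("-7"))       # -7: a sign directly before the digits is allowed
print(int("101", 2))   # 5
print(int("z", 36))    # 35: letters stand for the digit values 10 to 35
print(int("0x1f", 0))  # 31: base 0 reads the prefix as in source code

try:
    int("1.7")
except ValueError as exc:
    print("rejected:", exc)
```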

digitaLink
  • 458
  • 3
  • 17
  • Strictly, it's not just about whether the string contains '.'. Scientific notation could also break the int-ness of the string. Examples: `"6e7"` is not an integer literal in base 10, so `int("6e7")` raises ValueError. However, `int("6e7",16)` = 1767 is valid in base 16 (as is any base >= 15, with a different value). But `int("6e-7")` is never an int in any base. – smci Oct 14 '15 at 00:41
1

It would break backwards compatibility. Accepting such strings is certainly possible, but it would be a terrible idea, because it would break the very old and well-established Python idiom of relying on a try...except ladder ("Easier to ask forgiveness than permission") to determine the type of a string's contents. This idiom has been around and used since at least Python 1.5, AFAIK; here are two citations: [1] [2]

s = "foo12.7"
# s = "-12.7"
# s = -12

try:
    n = int(s)  # raises ValueError if s is not an integer literal
    print("Do integer stuff with", n)
except ValueError:
    try:
        f = float(s)  # raises ValueError if s is not a float literal either
        print("Do float stuff with", f)
    except ValueError:
        print("Handle case for when s is neither float nor integer")
        raise  # if you want to reraise the exception

And another minor thing: it's not just about whether the string contains '.'. Scientific notation, or arbitrary letters, could also break the int-ness of the string. Examples: "6e7" is not an integer literal in base 10, so int("6e7") raises ValueError. However, int("6e7",16) = 1767 is valid in base 16 (as is any base >= 15, with a different value). But int("6e-7") is never an int in any base.

(And if you expand the base to base-36, any legal alphanumeric string (or Unicode) can be interpreted as representing an integer, but doing that by default would generally be a terrible behavior, since "dog" or "cat" are unlikely to be references to integers).
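The examples above can be checked with a short sketch. The helper is_int_literal is a made-up name for illustration and simply wraps the built-in int:

```python
def is_int_literal(text, base=10):
    """Hypothetical helper: True if int(text, base) succeeds."""
    try:
        int(text, base)
        return True
    except ValueError:
        return False

print(is_int_literal("6e7"))       # False: 'e' is not a base-10 digit
print(int("6e7", 16))              # 1767 = 6*256 + 14*16 + 7
print(is_int_literal("6e7", 15))   # True: 'e' is the digit 14
print(is_int_literal("6e7", 14))   # False
print(is_int_literal("6e-7", 36))  # False: '-' is only allowed as a leading sign
print(int("dog", 36))              # 17728
```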

smci
  • 32,567
  • 20
  • 113
  • 146