
What's the common practice for dealing with integer overflows like 999999 * 999999 (result > Integer.MAX_VALUE) from an application development team's point of view?
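
For concreteness, that multiplication simply wraps around in two's-complement arithmetic (a quick illustrative snippet; the exact product is 999998000001):

999999 * 999999            // Int arithmetic: wraps to -729379967 with no warning
BigInt(999999) * 999999    // 999998000001, the exact value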

One could just make BigInt mandatory and prohibit the use of Int, but is that a good or bad idea?

IODEV

4 Answers


If it is extremely important that the integer not overflow, you can define your own overflow-catching operations, e.g.:

def +?+(i: Int, j: Int) = {
  val ans = i.toLong + j.toLong
  if (ans < Int.MinValue || ans > Int.MaxValue) {
    throw new ArithmeticException("Int out of bounds")
  }
  ans.toInt
}

You may be able to use the enrich-your-library pattern to turn this into operators; if the JVM manages to do escape analysis properly, you won't get too much of a penalty for it:

class SafePlusInt(i: Int) {
  def +?+(j: Int) = { /* as before, except without i param */ }
}
implicit def int_can_be_safe(i: Int) = new SafePlusInt(i)

For example:

scala> 1000000000 +?+ 1000000000
res0: Int = 2000000000

scala> 2000000000 +?+ 2000000000
java.lang.ArithmeticException: Int out of bounds
    at SafePlusInt.$plus$qmark$plus(<console>:12)
    ...

If it is not extremely important, then standard unit testing and code reviews and such should catch the problem in the large majority of cases. Using BigInt is possible, but will slow your arithmetic down by a factor of 100 or so, and won't help you when you have to use an existing method that takes an Int.
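
As a side note, newer JVMs (Java 8 and later) ship overflow-checked arithmetic in java.lang.Math, which a wrapper like the one above could simply delegate to:

Math.addExact(1000000000, 1000000000)   // 2000000000
Math.addExact(2000000000, 2000000000)   // throws ArithmeticException
Math.multiplyExact(999999, 999999)      // throws ArithmeticException (the question's example)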

Rex Kerr
  • Hi Rex, thanks for the prompt reply! A reasonable solution, although it may require some refactoring. How hard would you say it is to tamper with the Integer base class, adding an overflow-check flag to the standard "operator" functions, considering the complex type system? – IODEV Mar 26 '12 at 16:16
  • Btw, regarding "+?+", is there a standard Scala naming convention one should use? Thanks in advance – IODEV Mar 26 '12 at 16:23
  • @IODEV - You could add a wrapper class, but you can't sensibly tamper with the base class because it actually maps to the `int` primitive in the JVM and therefore has a lot of special compiler magic. The choice of `?` was arbitrary on my part; starting with `+` keeps the operator precedence the same, and I like symmetry (and enough others do that, while I wouldn't call this a _convention_, it is at least familiar), so I added another `+` at the end. `+@` would work just as well. – Rex Kerr Mar 26 '12 at 17:04

By far the most common practice regarding integer overflows is that programmers are expected to know the issue exists, to watch for cases where overflows might happen, and to make the appropriate checks or rearrange the math so that they won't happen, for example computing a * (b / c) rather than (a * b) / c. If the project uses unit tests, they will include cases that try to force overflows to happen.
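
A made-up example of that rearrangement (the names and numbers are only for illustration):

val ratePerThousand = 250
val amount = 20000000

(ratePerThousand * amount) / 1000   // the product 250 * 20000000 overflows Int, so this is wrong
ratePerThousand * (amount / 1000)   // 5000000, correct here, at the cost of truncating b / c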

I have never worked on or seen code from a team that required more than that, so I'm going to say that's good enough for almost all software.

In the one embedded application I've seen that actually, honest-to-spaghetti-monster NEEDED to prevent overflows, the team did it by proving that overflows weren't possible on each line where it looked like they might happen.

mjfgates
  • Underflow can be equally bad, which `a*(b/c)` can give you. Should generally do `((a*b).toLong/c).toInt` or the equivalent with `Double`. – Rex Kerr Mar 26 '12 at 17:05
  • As with overflows, the way to deal with underflows is for the programmer to be aware of the possibility, and to use his or her JUDGEMENT. Widening your arguments and then narrowing the result every time you do math does Bad Things to performance and code readability, and doesn't work in cases where the final result over- or underflows. – mjfgates Mar 27 '12 at 02:29
  • Depends how performance-critical the code is. Very low performance can get by with arbitrary-precision integers. Decent performance can get by with widening/narrowing. High performance needs the programmer to understand what allowed values are so widening/narrowing isn't needed. Ultra-high performance would be best served if you could avoid the division entirely, somehow. – Rex Kerr Mar 27 '12 at 02:59

If you're using Scala (and based on the tag I'm assuming you are), one very generic solution is to write your library code against the scala.math.Integral type class:

def naturals[A](implicit f: Integral[A]) =
  Stream.iterate(f.one)(f.plus(_, f.one))

You can also use context bounds and Integral.Implicits for nicer syntax:

import scala.math.Integral.Implicits._

def squares[A: Integral] = naturals.map(n => n * n)

Now you can use these methods with either Int or Long or BigInt as needed, since instances of Integral exist for all of them:

scala> squares[Int].take(10).toList
res0: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

scala> squares[Long].take(10).toList
res0: List[Long] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

scala> squares[BigInt].take(10).toList
res1: List[BigInt] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)

No need to change the library code: just use Long or BigInt where overflow is a concern and Int otherwise.

You will pay some penalty in terms of performance, but the genericity and the ability to defer the Int-or-BigInt decision may be worth it.
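
As a sketch of what deferring that decision can look like (sumOfSquares is a made-up helper, not part of the answer's code):

import scala.math.Integral.Implicits._

def sumOfSquares[A](xs: Seq[A])(implicit f: Integral[A]): A =
  xs.map(x => x * x).foldLeft(f.zero)(_ + _)

sumOfSquares((1 to 100000).toList)           // Int: fast, but silently overflows for this input
sumOfSquares((1 to 100000).map(BigInt(_)))   // BigInt: exact (333338333350000), slower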

Travis Brown

In addition to simple mindfulness, as noted by @mjfgates, there are a couple of practices that I always use when dealing with scaled-decimal (non-floating-point) real-world quantities. This may not be on point for your particular application - apologies in advance if not.

First, if there are multiple units of measure in use, values must always clearly identify what they are. This can be by naming convention, or by using a separate class for each unit of measure. I've always just used names - a suffix on every variable name. In addition to eliminating errors from confusion over the units, it encourages thinking about overflow because the measures are less likely to be thought of as just numbers.
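
A small invented illustration of both options (the value classes in the second option need Scala 2.10 or later):

// Option 1: the unit is a suffix on every variable name
val spanCm: Int = 2540
val spanIn: Int = spanCm * 100 / 254   // the conversion is explicit and visible at the use site

// Option 2: a separate type per unit, so mixing units is a compile-time error
case class Centimeters(value: Int) extends AnyVal
case class Inches(value: Int) extends AnyVal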

Second, my most frequent source of overflow concern is usually rescaling - converting from one measure to another - when it requires a lot of significant digits. For example, the conversion factor from cm to inches is 0.393700787402. In order to avoid both overflow and loss of significant digits, you need to be careful to multiply and divide in the right order. I haven't done this in a long time, but I believe what you want is something like:

Add to Rational.scala (the Rational class from Programming in Scala, "the book"):

  def rescale(i: Int): Int = {
    // apply the whole part of the ratio first, then the remainder,
    // so that the intermediate products stay small
    (i * (numer / denom)) + (i / denom * (numer % denom))
  }

Then you get as results (shortened from a specs2 test):

  val InchesToCm = new Rational(1000000000,393700787)
  InchesToCm.rescale(393700787) must_== 1000000000
  InchesToCm.rescale(1) must_== 2

This doesn't round or deal with negative scaling factors. A production implementation may want to factor out numer / denom and numer % denom.
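
A sketch of that factoring-out, as a bare-bones standalone Rational (whole and rem are invented names; the book's class has more to it):

  class Rational(val numer: Int, val denom: Int) {
    private val whole = numer / denom   // integer part of the ratio
    private val rem   = numer % denom   // remainder of the ratio

    def rescale(i: Int): Int = (i * whole) + (i / denom * rem)
  }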

Ed Staub
  • +1 for tying the unit of measure to the variable. Having the code yell at me if I tried to mix "character count" and "byte count" saved my bacon one year when I was coping with a pile of horrific DBCS-to-Unicode conversions. – mjfgates Mar 27 '12 at 02:32