In C and Java, there are defined constants representing the maximum and minimum values an integer can hold.

Are there such constants in awk? If so, what are their names?

The awk manual indicates that awk can support arbitrary precision integer arithmetic with -M, but I'd like to know about the bounds on integers when we do not specify -M.

merlin2011
  • It's implementation-defined (you'd have to write a test program to determine the available precision). By the way, that's the **`gawk`** manual. For **awk** you have to go to POSIX. – Thomas Dickey Apr 28 '18 at 00:38

2 Answers

Not something I've considered before, so I may be barking up the wrong tree completely, but since awk uses double-precision floating-point numbers by default, maybe what you're looking for is based on the value of PREC in gawk (see https://www.gnu.org/software/gawk/manual/gawk.html#Setting-precision). Look:

$ awk 'BEGIN{print PREC}'
53

$ awk 'BEGIN{print (2^52)}'
4503599627370496
$ awk 'BEGIN{print (2^52)+1}'
4503599627370497

$ awk 'BEGIN{print (2^PREC)}'
9007199254740992
$ awk 'BEGIN{print (2^PREC)+1}'
9007199254740992

Notice how integer arithmetic fails when you try to go beyond 2^PREC? So maybe 2^PREC is a reasonable value to use for a MAX_INT equivalent, and you could derive a MIN_INT similarly. Think about it, try it, and see if it makes sense for your needs.
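
The same cutoff can be reproduced outside awk. The following is only a sketch, assuming awk's numbers are IEEE 754 doubles, which is also what a Python float is on common platforms. The variable names are illustrative and not part of awk or gawk:

```python
import sys

# Number of significand bits in a double; this matches the PREC value of 53 printed above.
prec = sys.float_info.mant_dig

limit = 2.0 ** prec

# Below the limit, adding 1 still produces a distinct value.
below_ok = (2.0 ** (prec - 1)) + 1 != 2.0 ** (prec - 1)

# At the limit, adding 1 is rounded away and the value does not change.
at_limit_stuck = limit + 1 == limit

print(prec, int(limit), below_ok, at_limit_stuck)
```

If the assumption holds, every integer up to 2^53 in magnitude is stored exactly, and gaps start to appear above that.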

Ed Morton

High integers in current (g)awk are oddly broken without -M. It is easy to spot that BEGIN {print 2^1024} yields inf, whereas BEGIN {print 2^1023} works. One would therefore assume that the maximum integer in this particular implementation is 2^1024 − 1. Yet this is not the case.

A simple experiment, based on the fact that 2^1024 − 1 = 2^1023 + 2^1022 + ⋯ + 2^1 + 2^0:

BEGIN {for (i = 1023; i >= 0; --i) sum += 2^i; print sum}

This^^^ yields infinity, surprisingly enough. So, at which point do we need to stop adding the powers of 2 in order to obtain a valid result? On my systems the lowest exponent that still works appears to be 971; lower the bound to 970 and it sums to infinity.

BEGIN {for (i = 1023; i >= 971; --i) sum += 2^i; print sum}

This^^^ prints 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.
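
The cutoff at 971 can be cross-checked with ordinary doubles. This is a rough sketch, assuming awk without -M accumulates in IEEE 754 double precision just as a Python float does. The names `total` and `first_inf_exponent` are made up for the illustration:

```python
import math

total = 0.0                  # a Python float, used here as a stand-in for an awk number
first_inf_exponent = None

# Add 2^1023, 2^1022, ... and stop at the first addition that overflows.
for i in range(1023, -1, -1):
    total += 2.0 ** i
    if math.isinf(total):
        first_inf_exponent = i
        break

print(first_inf_exponent)
```

Under that assumption, the 53 terms from 2^1023 down to 2^971 fit exactly into the 53-bit significand, and the next term is the first one that cannot be absorbed.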

The value has a surprising property in awk: whatever you add to it, up to a certain amount, no longer changes it. (Try printing, e.g., sum + 3.) Adding an amount beyond a certain threshold eventually yields infinity, although the printed value appears unchanged below that threshold. This looks like a bug at first, but it is consistent with how rounding works in IEEE 754 double-precision floating point.
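
One way to probe this behaviour is to compare the value against the largest finite double. The sketch below assumes awk stores numbers as IEEE 754 doubles and uses Python floats as a stand-in; all names are hypothetical:

```python
import sys

dbl_max = sys.float_info.max   # largest finite double

# The 53-term sum 2^1023 + ... + 2^971, computed with exact Python integers.
exact = sum(2 ** i for i in range(971, 1024))
is_dbl_max = int(dbl_max) == exact

# Neighbouring doubles at this magnitude are 2^971 apart, so small addends round away.
absorbs_small = dbl_max + 3 == dbl_max
absorbs_2_969 = dbl_max + 2.0 ** 969 == dbl_max

# An addend of 2^970 (half the spacing) rounds up past the largest double.
overflows = dbl_max + 2.0 ** 970 == float("inf")

print(is_dbl_max, absorbs_small, absorbs_2_969, overflows)
```

If all four checks print True, the sum is simply the largest finite double, and the threshold mentioned above is 2^970.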

As for the original sum above (2^1023 + ⋯ + 2^971), it is still correct in awk. Things start to fall apart once you try to increase that sum further. For example (and surprisingly), this still yields the same result as above:

BEGIN {for (i = 1023; i >= 971; --i) sum += 2^i
       for (i = 969; i >= 0; --i) sum += 2^i
       print sum}

Checking both sums with Python is easy:

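total = 0  # Python integers are arbitrary precision, so these sums are exact

# 2^971 + ... + 2^1023
for i in range(971, 1024):
    total += 2**i
print(total)  # awk gets this right

# add 2^0 + ... + 2^969 (2^970 is skipped, as in the awk script above)
for i in range(0, 970):
    total += 2**i
print(total)  # awk without -M gets this wrong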

All in all, I think I will be setting -M in awk all the time from now on!

Andrej Podzimek
  • Good investigation. At a glance, the surprising properties all probably make sense if you assume that variables are all floating point with double precision. Note: `-M` or `--bignum` (https://www.gnu.org/software/gawk/manual/html_node/Options.html) is likely specific to `gawk`. Many distros (including Ubuntu) may default to other versions such as `mawk`. – mwfearnley Feb 01 '21 at 14:25