0

I am trying to solve Project Euler+ #97 from Hackerrank. The problem asks to calculate the last 12 digits of A x B ** C + D. My attempt was to use the modular exponentiation mod 10 ** 12 from Wikipedia in order to efficiently calculate the last 12 digits and avoid overflow. However, for all cases aside from the sample 2 x 3 ** 4 + 5 I am getting wrong. According to the constraints there should be no overflow for unsigned long long.


The problem:

Now we want to learn how to calculate some last digits of such big numbers. Let's assume we have a lot of numbers A x B ** C + D and we want to know last 12 digits of these numbers.

Constraints:

  • 1 ≤ T ≤ 500000
  • 1 ≤ A, B, C, D ≤ 10 ** 9

Input: First line contains one integer T - the number of tests. T lines follow containing 4 integers (A, B, C and D) each.

Output: Output exactly one line containing exactly 12 digits - the last 12 digits of the sum of all results. If the sum is less than 10 ** 12 print corresponding number of leading zeroes then.


My attempt in C

#include <stdio.h>

int main() {
    const unsigned long long limit = 1000000000000;
    int cases;
    for (scanf("%d", &cases); cases; --cases) {
        // mult = A, base = B, exp = C, add = D
        unsigned long long mult, base, exp, add;
        scanf("%llu %llu %llu %llu", &mult, &base, &exp, &add);
        base = base % limit;
        while (exp) {
            if (exp & 1) {
               mult = (mult * base) % limit;
            }
            exp >>= 1;
            base = (base * base) % limit;
        }
        printf("%012llu\n", (mult + add) % limit);
    }
    return 0;
}
Neil A.
  • 841
  • 7
  • 23
  • `(mult * base) % limit` really? not `((mult % limit) * (base % limit)) % limit` or something? – Will Ness Jun 19 '17 at 23:12
  • Are you allowed to use 64b integer arithmetic? If so, then can't you just compute each value modulo 2^64 and use a 64b accumulator to add them up? Then modulo your answer 10^12 and printf using "%012ld" format? – jschultz410 Jun 19 '17 at 23:12
  • @WillNess: That is not necessary because whenever `mult` is multiplied the result is reduced modulo `limit` at the end; likewise for `base` – Neil A. Jun 19 '17 at 23:13
  • and you're *sure* multiplying `mult` and `base` does not overflow? I thought you were getting "wrong" responses, no? --- you never show us the correspondence to A, B, C and D of your code vars, BTW, so I have no way of knowing the ranges of e.g. `base` &c. – Will Ness Jun 19 '17 at 23:15
  • @WillNess: Edited to show correspondence to A, B, C, D. and limits or variables. – Neil A. Jun 19 '17 at 23:17
  • @jschultz410: Could you please clarify what you mean? – Neil A. Jun 19 '17 at 23:19
  • One other obvious problem is your order of operations. Shouldn't you compute A * (B^C) + D? It looks like you are computing (A * B)^C + D. – jschultz410 Jun 19 '17 at 23:19
  • do you compute power in log(n) ? it's very good :) – Parham Alvani Jun 19 '17 at 23:21
  • @NeilA. I'm saying that given the limits on your operands, you should be able to get away with using basic 64b integer math and then reducing to modulo 10^12 because the 64b space is much larger than the 10^12 space. Unless, I've got my maths wrong. – jschultz410 Jun 19 '17 at 23:22
  • @jschultz410: Looking at the wikipedia implementation, instead of using a `result` variable then multiplying by `mult` I simply use `mult` as `result` with the multiplication predone as `1 * mult` – Neil A. Jun 19 '17 at 23:22
  • I don't see you are XOR-ing in your code. – too honest for this site Jun 19 '17 at 23:23
  • @Olaf: Why do I need XOR? – Neil A. Jun 19 '17 at 23:23
  • @jschultz410: (10^9) ^ (10^9) would overflow 2^64, no? – Neil A. Jun 19 '17 at 23:24
  • @NeilA.: `B ^ C`, etc. – too honest for this site Jun 19 '17 at 23:25
  • @Olaf: Ah, that was meant to be the exponent operator, I'll use `**` instead. – Neil A. Jun 19 '17 at 23:26
  • @NeilA. It definitely would, but it would also overflow 10^12 by a lot too. I might be wrong because 2^64 is not a multiple of 10, but I was thinking that 64b math would still give you the correct lower significant bits that you would then reduce down to 10^12. – jschultz410 Jun 19 '17 at 23:26
  • @jschultz410: I think because 2^64 is not a multiple of 10^12, the digits will not be the same, i.e. 233 mod 128 = 105 whereas 233 mod 100 = 33 – Neil A. Jun 19 '17 at 23:28
  • @NeilA. Yeah, I think that you are right about that. I was thrown off because you are doing modulo 2^64 math anyway when you multiply, but 10^18 still fits inside 2^64, so you aren't losing any higher order bits. I still think you have your order of operations wrong. I don't think your inner loop should involve 'mult' at all. Right? Shouldn't it only involve 'base' and 'exp' (and 'limit')? – jschultz410 Jun 19 '17 at 23:37
  • @jschultz410: `mult` is equivalent to `result` in Wikipedia's implementation, except for it is initialized to `A`. Basically, instead of starting at 1, finding `B ** C % limit` then multiplying by `A` and reducing modulo `limit`, the multiply by `A` is done first and then is subsequently reduced modulo `limit` as it is multiplied by `B ** C`. Think of it as `A * B * B * B ...` modulo `limit` instead of `B * B * B ... * A` modulo `limit` – Neil A. Jun 19 '17 at 23:41
  • this is basically a "debug me" request. can't you run some tests of your own on ideone.com or something similar? can you print your UUINT_MAX or what it's called, to be sure you're working in 64 bit? – Will Ness Jun 19 '17 at 23:48
  • 1
    @NeilA. Ok, I think I wrapped my head around that finally. Sorry I'm being so slow. So, here's a question: we are doing all this math modulo 10^12, but doesn't 10^9 * 10^12 = 10^21 math possibly overflow a 2^64 space? 10^21 = (2^3.3219...)^21 ~= 2^69.76... ??? – jschultz410 Jun 20 '17 at 00:03
  • @jschultz410 yes, `64 * log(base_10)(2) = 19.26592` i.e. `2^64 ~= 10**19.26592 ~= 1.844675 * 10^19`. – Will Ness Jun 20 '17 at 00:19
  • @NeilA. I updated my answer. I think you can definitely overflow 64b in the computation of both mult and base in the inner loop. I think you need to force 80b or 128b math instead. – jschultz410 Jun 20 '17 at 01:14
  • Well, at least I got points on hacker rank for this answer ;) – jschultz410 Jun 20 '17 at 03:01

1 Answers1

2

I think you can overflow unsigned long long math (e.g. - modulo 2^64) because your computation of base in your inner loop can get as high as (10^12 - 1)^2 ~= 10^24 ~= 2^79.726, which is much more than 2^64. For example, think about B = 10^6 - 1 and C = 4.

On my MacBook Pro running a 64b version of Mac OS X with clang 8.1.0:

#include <stdio.h>

int main()
{
  fprintf(stdout, "sizeof(unsigned long long) = %u\n", (unsigned) sizeof(unsigned long long));
  fprintf(stdout, "sizeof(__uint128_t) = %u\n", (unsigned) sizeof(__uint128_t));
  fprintf(stdout, "sizeof(long double) = %u\n", (unsigned) sizeof(long double));

  return 0;
}

Says:

sizeof(unsigned long long) = 8
sizeof(__uint128_t) = 16
sizeof(long double) = 16

If your platform says 16 or 10 instead for long long, then I think you are in the clear. If it says 8 like mine does, then you need to rework your answer to either force 128b (or 80b) integer math natively or mimic it some other way.

You can try __uint128_t, which is supported by gcc and clang. Otherwise, you'd need to resort to something like long double and fmodl(), which might have enough mantissa bits but might not give exact answers like you want.

Also, you don't accumulate multiple results like the task says. Here's my shot at it, based on your program, but using __uint128_t.

#include <stdio.h>
#include <stdlib.h>

#define BILLION     1000000000
#define TRILLION 1000000000000

int main() 
{
  const __uint128_t limit = TRILLION;
  unsigned long     cases = 0;
  __uint128_t       acc   = 0;

  if (scanf("%lu", &cases) != 1 || cases == 0 || cases > 500000)
    abort();

  while (cases-- > 0)
  {            
    unsigned long a, b, c, d;
    __uint128_t b2c = 1, bbase;

    if (scanf("%lu %lu %lu %lu", &a, &b, &c, &d) != 4 ||
        a == 0 || a > BILLION || b == 0 || b > BILLION || 
        c == 0 || c > BILLION || d == 0 || d > BILLION)
      abort();

    for (bbase = b; c > 0; c >>= 1) 
    {
      if ((c & 0x1) != 0)
        b2c = (b2c * bbase) % limit;    // 64b overflow: ~10^12 * ~10^12 ~= 10^24 > 2^64

      bbase = (bbase * bbase) % limit;  // same overflow issue as above
    }

    // can do modulus on acc only once at end of program instead because
    // 5 * 10^5 * (10^9 * (10^12 - 1) + 10^9) = 5 * 10^26 < 2^128

    acc += a * b2c + d;  
  }

  acc %= limit;

  printf("%012llu\n", (unsigned long long) acc);

  return 0;
}
jschultz410
  • 2,849
  • 14
  • 22