I recently ran into this same problem.
First of all, I'm going to assume you mean 64-bit integers (after reading your comments), but I think this applies to big integers as well, because a naive multiplication there means doubling the word size and will be slow too.
Option 1
We use the following property:
Proposition. a*b mod m = (a - m)*(b - m) mod m
Proof.
(a - m)*(b - m) mod m =
(a*b - (a+b)*m + m^2) mod m =
(a*b - ((a+b) - m)*m) mod m =
a*b mod m, since ((a+b) - m)*m is a multiple of m and vanishes mod m.
q.e.d.
Moreover, if a and b are both close to m (and below it), then (a - m)*(b - m) mod m = (a - m)*(b - m): the two differences are small, so their product is already less than m and needs no reduction at all. You will still need to address the case where a, b > m (reduce them first), but note that (m - a)*(m - b) mod m = a*b mod m, which lets you work with the non-negative differences, is a corollary of the Proposition above; and of course don't do this when the differences are very big (a small modulus with a big a or b, or vice versa) or the product of the differences will itself overflow. A minimal sketch of this idea follows.
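Here is a minimal sketch of this in C (the function name mul_mod_near is just for illustration): it assumes a < m, b < m, and that (m - a)*(m - b) fits in a uint64_t.

#include <stdint.h>

/* Sketch of Option 1: only valid when a < m, b < m and the product of the
 * differences (m - a)*(m - b) does not overflow a uint64_t. */
uint64_t mul_mod_near(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t da = m - a;      /* -(a - m), kept as a non-negative value */
    uint64_t db = m - b;      /* -(b - m), kept as a non-negative value */
    return (da * db) % m;     /* equals a*b mod m by the Proposition    */
}

For example, with a = m - 3 and b = m - 5 the product of the differences is just 15, so the answer is 15 % m no matter how large m is.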
Option 2
From Wikipedia:
uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t d = 0, mp2 = m >> 1;
    int i;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    for (i = 0; i < 64; ++i)
    {
        d = (d > mp2) ? (d << 1) - m : d << 1;  /* d = 2*d (mod m) */
        if (a & 0x8000000000000000ULL)
            d += b;                             /* current top bit of a is set: add b */
        if (d >= m) d -= m;                     /* keep d reduced */
        a <<= 1;                                /* move on to the next bit of a */
    }
    return d;
}
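This is the classic double-and-add (shift-and-add) approach: it walks through the bits of a from the most significant one down, doubling the accumulator d modulo m at every step and adding b modulo m whenever the corresponding bit of a is set, so the full 128-bit product is never formed. The price is 64 loop iterations instead of a single multiplication.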
Also, assuming long double and 32- or 64-bit integers (not arbitrary precision), you can exploit the fact that the hardware keeps opposite halves of the product for the two types. Again from Wikipedia:
On computer architectures where an extended precision format with at least 64 bits of mantissa is available (such as the long double type of most x86 C compilers), the following routine is faster than any algorithmic solution: it employs the trick that, by hardware, floating-point multiplication keeps the most significant bits of the product, while integer multiplication keeps the least significant bits.
And do:
uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t m)
{
    long double x;
    uint64_t c;
    int64_t r;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    x = a;
    c = x * b / m;                              /* floating point: c is (approximately) the quotient a*b / m */
    r = (int64_t)(a * b - c * m) % (int64_t)m;  /* integer arithmetic: low bits of a*b - c*m, wrapping mod 2^64 */
    return r < 0 ? r + m : r;                   /* fix the sign if c was rounded one too high */
}
These are guaranteed not to overflow.
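As a quick sanity check, here is a minimal usage sketch (my own test values, assuming one of the mul_mod definitions above is compiled in): with a = b = m - 1 the result must be 1, since (m - 1)^2 = m^2 - 2*m + 1 ≡ 1 (mod m).

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint64_t m = (1ULL << 61) - 1;   /* a large modulus (2^61 - 1)                         */
    uint64_t a = m - 1, b = m - 1;   /* operands whose plain 64-bit product would overflow */
    printf("%" PRIu64 "\n", mul_mod(a, b, m));   /* expected output: 1 */
    return 0;
}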