I am trying to find the sum of the first r binomial coefficients for a fixed n.

(nC1 + nC2 + nC3 + ... + nCr) % M

where r <= n.

Is there an efficient algorithm to solve this problem?

ROMANIA_engineer
Rohit Sharma

2 Answers

My first answer was unsatisfactory for several reasons, one being that the paper I referenced is difficult to understand and to implement. So I'm going to propose a different solution to the problem below.

We want to calculate the sum of the first r binomial coefficients for fixed n, nC0 + nC1 + ... + nC(r-1), modulo M. Instead of reducing the computation of nCk by reducing n, it makes more sense to reduce k: we need nC(k-1) already as part of the sum; in addition, we may have r much less than n, so getting at the values by incrementing n could be far less efficient than incrementing r.

Here's the idea: First note that if r > n/2 we have nC0 + ... + nC(r-1) = 2^n - (nCr + ... + nCn) = 2^n - (nC0 + ... + nC(n-r)) where n-r < n/2, so we have reduced the problem to the case where r <= n/2.
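As a quick check of this reduction, here is a minimal Python sketch using exact big integers. The function names are hypothetical, and `math.comb` assumes Python 3.8 or later. It is only meant to confirm the symmetry argument, not to be efficient.

```python
from math import comb

def prefix_sum_exact(n, r):
    """Exact nC0 + ... + nC(r-1) using Python big integers."""
    return sum(comb(n, k) for k in range(r))

def prefix_sum_reduced(n, r):
    """Same value, but never sums more than about n/2 terms."""
    if r > n - r + 1:
        # nCr + ... + nCn equals nC0 + ... + nC(n-r), which has n-r+1 terms
        return 2**n - prefix_sum_exact(n, n - r + 1)
    return prefix_sum_exact(n, r)

print(prefix_sum_reduced(10, 8))  # 968
```

For n = 10 and r = 8 the reduced version sums only three terms (1 + 10 + 45) and subtracts them from 2^10.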

Next, apply the identity

nCk = n!/(k!(n-k)!) = n!/((k-1)!(n-(k-1))!) x (n-k+1)/k = nC(k-1) x (n-k+1)/k

to calculate the terms of the sum in order. If our integers were unbounded in size, we could calculate

sum = 0;
nCi = 1;           // nC0
for i = 1 to r
  sum += nCi;      // adds nC(i-1)
  nCi *= (n-i+1);
  nCi /= i;        // exact division: nCi is now n choose i
sum %= M;

The problem with this is that numbers nCi (and therefore sum) can become enormous, so we have to use big integers, which slow down the calculation. However, we only need the result mod M, so we can use ints if we perform calculations mod M inside the loop.

Sum and product are straightforward mod M, but division isn't. To divide nCi by k mod M (here M = 10^6), we need to write nCi and k in the form 2^s 5^t u where u is relatively prime to 10^6. Then we subtract exponents, and multiply by the inverse of u mod 10^6. In order to write nCi in that form, we also need to write n-k+1 in that form.

To put k and n-k+1 into the form 2^s 5^t u where u is relatively prime to 10^6, we could repeatedly check for divisibility by 2 and divide by 2, and then do the same for 5. However, it seems there should be a faster way.

In any case, the algorithm is now O(r), which seems to be the fastest possible, barring the discovery of a simple mathematical expression for the sum.
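The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a tuned implementation: the function names are hypothetical, M is assumed to have no prime factors other than 2 and 5 (such as 10^6), r is assumed to be at most n+1, and `pow(f, -1, M)` for the modular inverse needs Python 3.8 or later. Each iteration also does some logarithmic work (stripping factors, modular powers), so the cost per term is not strictly constant.

```python
from math import comb  # used only for cross-checking

def split_2_5(x):
    """Write x > 0 as 2**s * 5**t * u with u coprime to 10; return (s, t, u)."""
    s = t = 0
    while x % 2 == 0:
        x //= 2
        s += 1
    while x % 5 == 0:
        x //= 5
        t += 1
    return s, t, x

def binom_prefix_sum_mod(n, r, M=10**6):
    """nC0 + ... + nC(r-1) mod M, assuming M = 2**a * 5**b and r <= n + 1."""
    s = t = 0   # exact exponents of 2 and 5 in the current nCi
    u = 1       # the rest of nCi, reduced mod M (always invertible mod M)
    total = 0
    for i in range(r):
        # nCi is congruent to 2**s * 5**t * u mod M
        total = (total + pow(2, s, M) * pow(5, t, M) * u) % M
        if i + 1 < r:
            # step to nC(i+1) = nCi * (n - i) / (i + 1)
            a, b, c = split_2_5(n - i)
            d, e, f = split_2_5(i + 1)
            s += a - d
            t += b - e
            u = u * (c % M) * pow(f, -1, M) % M
    return total

print(binom_prefix_sum_mod(40, 15))
```

The result can be compared against `sum(comb(n, k) for k in range(r)) % M` for small inputs to gain confidence in the exponent bookkeeping.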

Edward Doolittle
  • Re: "In any case, the algorithm is now O(r)": Can you elaborate on that? The calculations that you describe don't seem obviously O(1) to me. – ruakh Dec 23 '19 at 00:10

Note that the "first" binomial coefficient for fixed n is nC0. Let f(n) = nC0 + nC1 + ... + nC(r-1). Using the "Pascal's triangle" identity, nCk = (n-1)C(k-1) + (n-1)Ck we have

    nC0 + nC1 + nC2 + ... + nC(r-1)
    = (n-1)C(-1) + (n-1)C0 + (n-1)C0 + (n-1)C1 + (n-1)C1 + (n-1)C2 + ... + (n-1)C(r-2) + (n-1)C(r-1) 
    = 2[(n-1)C0 + (n-1)C1 + (n-1)C2 + ... + (n-1)C(r-2)] + (n-1)C(r-1)
    = 2[(n-1)C0 + ... + (n-1)C(r-1)] - (n-1)C(r-1),
    
i.e., f(n) = 2f(n-1) - (n-1)C(r-1) so each of the sums can be computed from the previous by doubling the previous and subtracting (n-1)C(r-1).

For example, if r=3, then

    f(0) = 1, 
    f(1) = 1 + 1      =  2 = 2f(0) - 0C2, 
    f(2) = 1 + 2 +  1 =  4 = 2f(1) - 1C2,
    f(3) = 1 + 3 +  3 =  7 = 2f(2) - 2C2,
    f(4) = 1 + 4 +  6 = 11 = 2f(3) - 3C2,
    f(5) = 1 + 5 + 10 = 16 = 2f(4) - 4C2,
    
and so on.
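The recurrence can be checked with a short Python sketch. The function name is hypothetical, `math.comb` assumes Python 3.8 or later, and r is assumed to be at least 1. The subtracted coefficient is computed exactly here rather than mod m.

```python
from math import comb

def prefix_sums_by_recurrence(n_max, r):
    """Return [f(0), ..., f(n_max)] where f(n) = nC0 + ... + nC(r-1), r >= 1."""
    f = [1]  # f(0) = 0C0 = 1
    for n in range(1, n_max + 1):
        # f(n) = 2 f(n-1) - (n-1)C(r-1)
        f.append(2 * f[-1] - comb(n - 1, r - 1))
    return f

print(prefix_sums_by_recurrence(5, 3))  # [1, 2, 4, 7, 11, 16]
```

The printed list matches the r=3 table above.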

To perform the calculations mod m, you would need to pre-calculate the binomial coefficients (n-1)C(r-1) mod m. If m is prime, then by Lucas's theorem the binomial coefficients nC(r-1) mod m are periodic in n with period m^k, where m^k is the smallest power of m greater than r-1. If m is a power of a prime, the results are rather more complicated. (See http://www.dms.umontreal.ca/~andrew/PDF/BinCoeff.pdf.) If m has more than one prime factor, the calculations can be reduced to the previous cases using the Chinese Remainder Theorem.
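For the prime case only, a minimal sketch of Lucas's theorem in Python might look like this. The function name is hypothetical and `math.comb` assumes Python 3.8 or later. The prime-power and composite cases discussed above are not covered.

```python
from math import comb

def binom_mod_prime(n, k, p):
    """nCk mod p for prime p, multiplying digit-wise binomials in base p."""
    result = 1
    while n > 0 or k > 0:
        n, nd = divmod(n, p)
        k, kd = divmod(k, p)
        if kd > nd:
            return 0  # a base-p digit of k exceeds the matching digit of n
        result = result * comb(nd, kd) % p
    return result

# Periodicity in n: with p = 5 and k = 7 < 5**2, shifting n by 25 changes nothing.
print(all(binom_mod_prime(n + 25, 7, 5) == binom_mod_prime(n, 7, 5)
          for n in range(100)))  # True
```

Adding p^j to n leaves its lowest j base-p digits unchanged, and k has no digits beyond those, which is why the period appears.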

Edward Doolittle
  • Why would (n+m) choose (r-1) equal n choose (r-1) mod m? For example, 11 choose 11 is 1, but 21 choose 11 = 352716, and mod 10 these are not equal. – Douglas Zare Apr 03 '15 at 23:19
  • Of course, you're right. If m is prime, then by Lucas's theorem the cycle length is m^k for some k, but for m composite the situation is much more complicated. – Edward Doolittle Apr 04 '15 at 21:44
  • I like the last paragraph in this answer. If you are calculating the binomial coefficients anyway, why not just calculate `nC0`, `nC1`, `nC2`, ..., `nCr` mod `M` and sum them mod `M`? Why go through the trouble of defining and using `f(n)`? – Matt Jun 20 '15 at 20:09
  • Yes, I agree completely, and the same thought occurred to me. However the paper to which I referred is not entirely straightforward to translate to code, so I've been trying to think of a way to solve the problem that doesn't use the result of the paper. I think I have a solution: it's better to come from the edge instead of the top of the triangle. I'll submit another answer shortly. – Edward Doolittle Jun 22 '15 at 06:10