Program to find all primes in a very large given range of integers

Question

I came across the following question on a programming website: Peter wants to generate some prime numbers for his cryptosystem. Help him! Your task is to generate all prime numbers between two given numbers!

Input

The input begins with the number t of test cases in a single line (t<=10). In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.

I came up with the following solution :

import java.util.*;

public class PRIME1 {
    static int numCases;
    static int left, right;
    static boolean[] initSieve = new boolean[32000];
    static boolean[] answer;

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        numCases = sc.nextInt();
        initSieve[0] = true;
        initSieve[1] = true;
        Sieve();
        for (int j = 0; j < numCases; j++) {
            String line = sc.next();
            String line2 = sc.next();
            left = Integer.parseInt(line);
            right = Integer.parseInt(line2);
            answer = new boolean[right - left + 1];
            getAnswer();
            for (int i = 0; i < answer.length; i++) {
                if (!answer[i]) {
                    int ans = i + left;
                    System.out.println(ans);
                }
            }
            System.out.println();
        }
    }

    public static void Sieve() {

        for (int i = 2; i < 32000; i++) {
            if (!initSieve[i]) {
                for (int j = 2 * i; j < 32000; j += i) {
                    initSieve[j] = true;
                }
            }
            if (i * i > 32000)
                break;
        }
    }

    public static void getAnswer() {
        for (int i = 2; i < 32000 && i <= right; i++) {
            if (!initSieve[i]) {
                int num = i;
                if (num * 2 >= left) {
                    num *= 2;
                } else {
                    num = (num * (left / num));
                    if (num < left)
                        num += i;
                }
                for (int j = num; j >= left && j <= right; j += i) {
                    answer[j - left] = true;
                }
            }
        }
    }
}

I have edited my solution after reading some of the suggestions, but I am still getting a time limit exceeded error. Any more suggestions on how to optimize this further? I am calculating all the primes up to 32000 and then using these to find the primes between m and n.

Thanks, Rohit

If you start the sieve at 3 instead of 2 and handle the value 2 outside the loop, you can iterate with i+=2. Then, instead of running to isNotPrime.length in the outer loop, `√(isNotPrime.length)` should be enough. Unrelated: Scanner has a method nextInt. – user unknown, May 22 '12 at 14:48
You can halve your run time instantly by looping over just the odd numbers in the range: start from an odd number and increment by `i+=2` in `for (int i = 0; i < answer.length; i+=2)`. Make sure that `i` corresponds to an odd number not below your `left`. No even number above 2 is ever going to be a prime. :) That would be a sparse addressing scheme; even faster is to work with a compressed array, where the entry at index `i` represents the number `n = left_odd + 2i`. Inside `Sieve()`, step by `j+=2*i` too (though this sieve is very small), but more importantly do so inside `getAnswer()`. Look at Daniel's answer. – Will Ness, May 23 '12 at 10:19

Daniel Fischer · Accepted Answer · 2012-05-22T15:53:49.143

You are given

1 <= m <= n <= 1000000000, n-m<=100000

These are very small numbers. To sieve a range with an upper bound of n, you need the primes up to √n. Here you know n <= 10^9, so √n < 31623, and you need at worst the primes up to 31621. There are 3401 of them. You can generate them with a standard sieve in a few microseconds.

Then you can simply sieve the small range from m to n by marking the multiples of the primes you've sieved before, stopping when the prime exceeds √n. Some speedup can be gained by eliminating the multiples of some small primes from the sieve, but the logic becomes more complicated (you need to treat sieves with small m specially).

public int[] chunk(int m, int n) {
    if (n < 2) return null;
    if (m < 2) m = 2;
    if (n < m) throw new IllegalArgumentException("Borked");
    int root = (int)Math.sqrt((double)n);
    boolean[] sieve = new boolean[n-m+1];
    // primes is the global array of primes to 31621 populated earlier
    // primeCount is the number of primes stored in primes, i.e. 3401
    // We ignore even numbers, but keep them in the sieve to avoid index arithmetic.
    // It would be very simple to omit them, though.
    for(int i = 1, p = primes[1]; i < primeCount; ++i) {
        if ((p = primes[i]) > root) break;
        int mult;
        if (p*p < m) {
            mult = (m-1)/p+1;
            if (mult % 2 == 0) ++mult;
            mult = p*mult;
        } else {
            mult = p*p;
        }
        for(; mult <= n; mult += 2*p) {
            sieve[mult-m] = true;
        }
    }
    int count = m == 2 ? 1 : 0;
    for(int i = 1 - m%2; i <= n-m; i += 2) { // <= so that n itself (index n-m) is checked
        if (!sieve[i]) ++count;
    }
    int sievedPrimes[] = new int[count];
    int pi = 0;
    if (m == 2) {
        sievedPrimes[0] = 2;
        pi = 1;
    }
    for(int i = 1 - m%2; i <= n-m; i += 2) {
        if (!sieve[i]) {
            sievedPrimes[pi++] = m+i;
        }
    }
    return sievedPrimes;
}
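
The `chunk` method above relies on a global `primes` array and a `primeCount` that are "populated earlier" but not shown. A minimal sketch of how that table could be built follows; the class name `BasePrimes` and the method `upTo` are hypothetical names chosen for this illustration, not part of the original answer.

```java
final class BasePrimes {
    /** Returns all primes <= limit using a plain sieve of Eratosthenes. */
    static int[] upTo(int limit) {
        if (limit < 2) return new int[0];
        boolean[] composite = new boolean[limit + 1];
        int count = 0;
        for (int i = 2; i <= limit; i++) {
            if (composite[i]) continue;
            count++;
            // Start at i*i: smaller multiples were already crossed off by smaller primes.
            for (long j = (long) i * i; j <= limit; j += i) {
                composite[(int) j] = true;
            }
        }
        int[] primes = new int[count];
        int k = 0;
        for (int i = 2; i <= limit; i++) {
            if (!composite[i]) primes[k++] = i;
        }
        return primes;
    }

    public static void main(String[] args) {
        int[] primes = upTo(31621);   // would play the role of the global primes array
        int primeCount = primes.length;
        System.out.println(primeCount);
    }
}
```

The table only has to be built once, before the first test case, and can then be reused for every interval.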

Using a BitSet or any other type of packed flag-array would reduce the memory usage and thus may give a significant speed-up due to better cache-locality.
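
As a rough illustration of the packed flag-array idea, here is a sketch that stores only the odd numbers of the interval in a `java.util.BitSet`, so that bit `i` stands for the number `first + 2*i`. The class `PackedSegmentSieve`, its helper `basePrimes`, and the method `primesBetween` are assumed names for this example; whether the packing actually speeds things up depends on the machine and would have to be measured.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class PackedSegmentSieve {
    /** Plain sieve: all primes <= limit (hypothetical helper for this sketch). */
    static int[] basePrimes(int limit) {
        BitSet composite = new BitSet(Math.max(limit + 1, 0));
        List<Integer> found = new ArrayList<>();
        for (int i = 2; i <= limit; i++) {
            if (composite.get(i)) continue;
            found.add(i);
            for (long j = (long) i * i; j <= limit; j += i) composite.set((int) j);
        }
        int[] out = new int[found.size()];
        for (int k = 0; k < out.length; k++) out[k] = found.get(k);
        return out;
    }

    /** Primes in [m, n]; base must contain every prime <= sqrt(n). */
    static List<Integer> primesBetween(int m, int n, int[] base) {
        List<Integer> result = new ArrayList<>();
        if (n < 2) return result;
        if (m <= 2) result.add(2);
        long first = Math.max(m, 3) | 1;          // first odd candidate >= max(m, 3)
        if (first > n) return result;
        int size = (int) ((n - first) / 2 + 1);   // number of odd candidates
        BitSet composite = new BitSet(size);
        for (int p : base) {
            if (p == 2) continue;                 // even numbers are not stored at all
            if ((long) p * p > n) break;
            // Smallest multiple of p that is >= first and >= p*p, then made odd.
            long mult = Math.max((long) p * p, (first + p - 1) / p * p);
            if ((mult & 1) == 0) mult += p;
            for (; mult <= n; mult += 2L * p) {
                composite.set((int) ((mult - first) / 2));
            }
        }
        for (int i = composite.nextClearBit(0); i < size; i = composite.nextClearBit(i + 1)) {
            result.add((int) (first + 2L * i));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] base = basePrimes(31622);
        System.out.println(primesBetween(999_999_900, 1_000_000_000, base));
    }
}
```

Compared with a `boolean[]` over the full interval, this uses one bit per odd number, which is roughly a sixteenth of the memory.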

I have edited my solution according to your suggestion, but am still getting a time limit exceeded situation. Could you please suggest as to how to further optimize? — Rohit Agrawal, May 22 '12 at 18:49
What's the time limit, and what is the task? If you should print out all the primes, that is probably the bottleneck. I'm not well-versed in Java's I/O, so I couldn't offer advice how to speed up the printing. If you should only print the number (or sum) of primes in each interval, that should hardly be a problem. This implementation does the setup and sieves 100 intervals of approximately 100000 length in under 90ms here, the startup time of the JVM is higher. Of course, the testing machines may be slower and have smaller caches, so replacing the `boolean[]` with `BitSet`s may be good. — Daniel Fischer, May 22 '12 at 19:06
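
If printing is indeed the bottleneck, one common remedy in Java is to collect the output in a buffer instead of calling `System.out.println` once per prime. The following is only a sketch under that assumption; the class `BufferedPrimeOutput` and the method `render` are made-up names, and it has not been checked against the judge's time limit.

```java
import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

final class BufferedPrimeOutput {
    /** Builds the text for one test case: one prime per line, then an empty line. */
    static String render(int[] primes) {
        StringBuilder sb = new StringBuilder(primes.length * 11 + 1);
        for (int p : primes) {
            sb.append(p).append('\n');
        }
        sb.append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // One buffered writer for the whole run; flush once at the end.
        PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out)));
        out.print(render(new int[] {2, 3, 5, 7}));
        out.flush();
    }
}
```

The same reasoning applies to input: reading through a `BufferedReader` is usually cheaper than `Scanner`, although with at most ten test cases the input side is unlikely to matter here.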

user unknown · Answer 2 · 2012-05-22T15:06:20.443

1

Use a BitSet instead of an Array of Boolean.

public static BitSet primes (final int MAX)
{
    BitSet primes = new BitSet (MAX);
    // make only odd numbers candidates...
    for (int i = 3; i < MAX; i+=2)
    {
        primes.set(i);
    }
    // ... except no. 2
    primes.set (2, true);
    for (int i = 3; i < MAX; i+=2)
    {
        /*
            If a number z is already eliminated (like 9),
            because it is itself a multiple of a prime
            (example: 3), then all multiples of z (9) are
            already eliminated.
        */
        if (primes.get (i))
        {
            int j = 3 * i;
            while (j < MAX)
            {
                if (primes.get (j))
                    primes.set (j, false);
                j += (2 * i);
            }
        }
    }
    return primes;
}

edited May 22 '12 at 15:06

answered May 22 '12 at 14:09

user unknown

A 1000000000/32 integer element array... won't that still be a lot of space? Is there any way we can exploit the constraint that n-m <= 100000 in this program? Thanks! – Rohit Agrawal May 22 '12 at 14:14
It's about 1/8 GB (10^9 bits is 125 MB), isn't it? Price about: half a buck. I would rather see a problem in the time needed to fill this bitset. However, I don't know how to use Euler's totient to solve the problem and don't see how the 100 000 interval helps. I don't expect that the probablePrime method is allowed. – user unknown May 22 '12 at 14:26

score 0 · Answer 3 · answered May 22 '12 at 14:19

Do you HAVE to store the result in the array? How about a method that computes whether a given integer is prime, and just calling it for each number in {left,left+1,...,right}?

snajahi

I haven't initialized the isNotPrime table, as the values default to false, which is what I intend. Also, the method you suggest would be very expensive in terms of time, which is why I didn't use it. – Rohit Agrawal May 22 '12 at 14:23
Testing each candidate number in isolation for primality (by trial division?) is much slower than using an offset sieve of Eratosthenes to mark off the composites on the whole range at once (for each prime below the `sqrt` of the upper limit, of course). – Will Ness May 23 '12 at 10:09

score 0 · Answer 4 · answered May 22 '12 at 14:22

You can always use an offset when accessing the isNotPrime array.

Given m, n:

boolean[] isNotPrime = new boolean[n-m+1];

// to know whether number x is prime or not
boolean xIsPrime = !isNotPrime[x-m];

Here m is the offset.
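
To make the offset idea concrete, the sketch below marks composites in a window `[m, n]` while indexing the array with `x - m`. It assumes `1 <= m <= n` as in the problem statement, and for brevity it steps through every `p` rather than only primes, which is correct but does some redundant work. The class `OffsetSieve` and the method `sieveWindow` are hypothetical names.

```java
final class OffsetSieve {
    /** Returns flags for [m, n]; entry x - m is true when x is NOT prime. Assumes 1 <= m <= n. */
    static boolean[] sieveWindow(int m, int n) {
        boolean[] isNotPrime = new boolean[n - m + 1];
        for (long p = 2; p * p <= n; p++) {
            // First multiple of p inside the window, but never p itself.
            long start = Math.max(p * p, (m + p - 1) / p * p);
            for (long x = start; x <= n; x += p) {
                isNotPrime[(int) (x - m)] = true;   // m is the offset
            }
        }
        if (m == 1) isNotPrime[0] = true;           // 1 is not a prime
        return isNotPrime;
    }

    public static void main(String[] args) {
        int m = 10, n = 20;
        boolean[] isNotPrime = sieveWindow(m, n);
        for (int x = m; x <= n; x++) {
            if (!isNotPrime[x - m]) System.out.println(x);
        }
    }
}
```

The array never grows beyond `n - m + 1` entries, so the 100000 bound on the interval length is what keeps memory small, regardless of how large `n` is.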

Guillaume Polet

score 0 · Answer 5 · answered May 22 '12 at 14:36

You are not forced to have one large array: you can keep a list of the primes found so far and test using multiple arrays, where value = array_slot + offset (the offset being the number of values already tested). Once you have finished the values from i to j, you add j-i to the offset and start a new array starting from j.

You can remove even numbers from your array, that would save you some space (values = array_slot * 2 - 1).
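
One possible reading of the multiple-array suggestion is to slide a fixed-size window over the range and reuse a single buffer, with the window start acting as the offset. The sketch below counts primes that way; `WindowedSieve`, `basePrimes`, `countPrimes`, and the window size are all assumptions made for the illustration.

```java
import java.util.Arrays;

final class WindowedSieve {
    static final int WINDOW = 1 << 16;   // arbitrary window size for this sketch

    /** Plain sieve: all primes <= limit (hypothetical helper). */
    static int[] basePrimes(int limit) {
        boolean[] composite = new boolean[Math.max(limit + 1, 0)];
        int[] tmp = new int[Math.max(limit, 0)];
        int count = 0;
        for (int i = 2; i <= limit; i++) {
            if (composite[i]) continue;
            tmp[count++] = i;
            for (long j = (long) i * i; j <= limit; j += i) composite[(int) j] = true;
        }
        return Arrays.copyOf(tmp, count);
    }

    /** Counts primes in [lo, hi]; base must contain every prime <= sqrt(hi). */
    static int countPrimes(long lo, long hi, int[] base) {
        boolean[] composite = new boolean[WINDOW];   // reused for every window
        int count = 0;
        for (long offset = Math.max(lo, 2); offset <= hi; offset += WINDOW) {
            long end = Math.min(hi, offset + WINDOW - 1);
            Arrays.fill(composite, false);
            for (int p : base) {
                if ((long) p * p > end) break;
                long start = Math.max((long) p * p, (offset + p - 1) / p * p);
                for (long x = start; x <= end; x += p) {
                    composite[(int) (x - offset)] = true;   // value = slot + offset
                }
            }
            for (long x = offset; x <= end; x++) {
                if (!composite[(int) (x - offset)]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1, 1_000_000, basePrimes(1000)));
    }
}
```

For the problem in the question a single window already covers the whole interval, so the windowing mainly matters when the range is much longer than the cache.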

score 0 · Answer 6 · answered May 22 '12 at 18:48

Since the distance between m and n is relatively small, you can brute-force the problem by running a fast primality test on every number between m and n.

If you allow probabilistic algorithms, you can use Miller-Rabin test. Let M = n-m <= 10^5 and N = n <= 10^9. The complexity of the brute-force algorithm would be O(k M (log N)^3), where k is a constant that controls probabilistic guarantees (for practical applications, k can be set to 10).

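For the limits of the problem, this bound is around 10^10 bit operations (10 · 10^5 · 30^3 ≈ 2.7 · 10^10 with base-2 logarithms). The (log N)^3 factor assumes arbitrary-precision arithmetic; with N < 2^31 each modular multiplication fits in a 64-bit machine word, so the practical cost is closer to k M log N, i.e. about 3 · 10^7 multiplications.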
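
For numbers of this size the test does not need to be probabilistic: Miller-Rabin with the fixed bases 2, 7 and 61 is known to be exact for all n below 4,759,123,141. The sketch below swaps in that deterministic base set instead of k random rounds; the class name `MillerRabin32` is made up, and the code is only meant for inputs in the `int` range so that products fit in a `long`.

```java
final class MillerRabin32 {
    /** Computes base^exp mod mod; safe from overflow while mod < 2^31. */
    static long powMod(long base, long exp, long mod) {
        long result = 1;
        base %= mod;
        while (exp > 0) {
            if ((exp & 1) == 1) result = result * base % mod;
            base = base * base % mod;
            exp >>= 1;
        }
        return result;
    }

    /** Deterministic for 0 <= n < 2^31 using the bases 2, 7 and 61. */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long p : new long[] {2, 3, 5, 7, 61}) {
            if (n % p == 0) return n == p;        // also guarantees base % n != 0 below
        }
        long d = n - 1;
        int r = 0;
        while ((d & 1) == 0) { d >>= 1; r++; }    // n - 1 = d * 2^r with d odd
        for (long a : new long[] {2, 7, 61}) {
            long x = powMod(a, d, n);
            if (x == 1 || x == n - 1) continue;
            boolean witness = true;               // a proves n composite unless we hit n - 1
            for (int i = 1; i < r; i++) {
                x = x * x % n;
                if (x == n - 1) { witness = false; break; }
            }
            if (witness) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        for (long x = 999_999_900L; x <= 1_000_000_000L; x++) {
            if (isPrime(x)) System.out.println(x);
        }
    }
}
```

Even so, each candidate costs a few dozen modular multiplications, so for an interval of 100000 numbers the offset sieve discussed in the other answers is likely to remain the cheaper option.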
