I have already implemented Sieve of Eratosthenes and another version with Wheel factorization, both in NumPy. But I don't think I have done it right, because the version with wheel factorization is actually slower.
I have already made several optimizations to the sieve. First all even numbers except 2 are composites (multiples of 2), so I only check odd numbers, the number of candidates is halved. Second all prime numbers except 2 and 3 are of the form 6k+1 or 6k-1, because if the modulo 6 remainder is 3 then the number must be a multiple of 3.
So starting from 5, only numbers with remainder of 1 or 5 need to be checked, the increment between each iteration is 6, thus the number of candidates is only a sixth of the original.
Third the start of the multiple for each prime number found is its square, this ensures this multiple cannot be previously flipped, and the step of the multiple increment is 2*n, because even multiples are already eliminated, so only odd multiples remain.
This is the version without Wheel factorization:
import numpy as np
def prime_sieve(n: int) -> np.ndarray:
primes = np.ones(n + 1, dtype=bool)
primes[:2] = primes[4::2] = primes[9::6] = False
for a in range(6, int(n**0.5) + 1, 6):
for b in (-1, 1):
i = a + b
if primes[i]:
primes[i * i :: 2 * i] = False
return np.where(primes)[0]
It is highly inefficient, because it does way too many flipping operations, the same number can be flipped multiple times, and the flipping doesn't skip all the numbers that aren't of the forms 6k+1 and 6k-1 that are already eliminated, thus a lot of computational redundancy.
I have implemented another version with Wheel factorization, using information found on Wikipedia (I wrote the code completely by myself, there isn't pseudocode for Wheel sieve):
import numpy as np
from itertools import cycle
def prime_wheel_sieve(n: int) -> np.ndarray:
wheel = cycle([4, 2, 4, 2, 4, 6, 2, 6])
primes = np.ones(n + 1, dtype=bool)
primes[:2] = primes[4::2] = False
primes[9::6] = primes[15::10] = False
k = 7
while (square := k * k) <= n:
if primes[k]:
primes[square::2*k] = False
k += next(wheel)
return np.where(primes)[0]
But it suffers the same drawbacks as the one without Wheel, and is less performant:
In [412]: %timeit prime_sieve(65536)
226 µs ± 2.97 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
In [413]: %timeit prime_wheel_sieve(65536)
236 µs ± 24 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
How can I properly implement Sieve of Eratosthenes with Wheel factorization in NumPy?