I think this is technically wheel factorization. I'm trying to re-compress my program's representation of the Sieve of Eratosthenes, which only contains indexes of numbers which are possibly prime.
Some background:
The most basic wheel is [2]: keep track of 2 as the first prime, and the sieve only contains odd indexes. (50%)
The next wheel is [2 3]: keep track of 2 and 3 as the first primes, and the sieve only contains the gaps, i.e. the residues modulo 2*3=6 that are coprime to 6 (1 and 5). Indexes are of the form 6k+1 and 6k+5. (33%)
The next wheel is [2 3 5]: keep track of 2, 3 and 5 as the first primes, and the sieve only needs 8 bits to represent intervals of size 30. (27%)
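For reference, the wheels above can be reproduced with a brute-force sketch like the one below. The helper name `wheel_gaps` is made up for illustration and is not part of my class; it assumes "gaps" means the residues in one turn of the wheel that share no factor with the product.

```python
from math import gcd

def wheel_gaps(primes):
    """Return (product, gaps) for the wheel built from `primes` (hypothetical helper)."""
    product = 1
    for p in primes:
        product *= p
    # A residue stays in the sieve when it shares no factor with the product.
    gaps = [r for r in range(1, product + 1) if gcd(r, product) == 1]
    return product, gaps

for primes in ([2], [2, 3], [2, 3, 5], [2, 3, 5, 7]):
    product, gaps = wheel_gaps(primes)
    # Compression ratio: stored bits per interval of size `product`.
    print(primes, product, len(gaps), "%.0f%%" % (100.0 * len(gaps) / product))
```

This costs time proportional to the product, which is exactly the setup cost that makes big wheels unattractive to build from scratch.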
When clearing the bits for multiples of a number, I find those multiples using this loop:
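    def multiplesIndices (self, i):
        # First turn of the wheel (k = 0). gaps[0] is skipped; assuming it is 1,
        # it would only yield i itself.
        for gap in self.gaps[1:]:
            ret = i * (self.product * 0 + gap)
            if ret > len (self): break
            yield ret
        # Later turns. // keeps the bound an integer under Python 3 as well.
        for k in range (1, len (self) // i // self.product + 1):
            for gap in self.gaps:
                ret = i * (self.product * k + gap)
                if ret > len (self): break
                yield ret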
The problem is the time it takes to set up a wheel, combined with the diminishing returns on the compression ratio. On top of that, changing to a different wheel size currently involves a lot of recomputation. Also, by varying the wheel size, I think I can affect the asymptotic complexity.
So my proposed solution is to use small wheels to initialize larger wheels:
[2 3] (6/2) to get the gaps in the [2 3 5] (30/8) wheel
[2 3 5] to get the gaps in the [2 3 5 7] (210/48) wheel
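A minimal sketch of that bootstrapping step, as I understand it: lay out p copies of the small wheel and drop the multiples of the new prime p. The function name `extend_wheel` is hypothetical, and it assumes p is a prime that does not already divide the product.

```python
def extend_wheel(product, gaps, p):
    """Build the wheel for product * p from the wheel (product, gaps).

    Hypothetical helper; assumes p is a prime that does not divide product.
    """
    new_gaps = []
    for k in range(p):                  # p turns of the small wheel
        for gap in gaps:
            n = k * product + gap
            if n % p != 0:              # drop the multiples of the new prime
                new_gaps.append(n)
    return product * p, new_gaps

product, gaps = 6, [1, 5]                       # the [2 3] wheel
product, gaps = extend_wheel(product, gaps, 5)  # the [2 3 5] wheel
print(product, gaps)
product, gaps = extend_wheel(product, gaps, 7)  # the [2 3 5 7] wheel
print(product, len(gaps))
```

The work here is proportional to p times the number of gaps in the small wheel, rather than to the full product of the large one.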
Where I need help is mapping the already-computed small sieve to the to-be-computed bigger sieve, so I can avoid re-sieving everything from 1. Get the primes below 30, use them to find the primes between 30 and 210, then use those to find the primes between 210 and 2310 (the product for the next wheel, [2 3 5 7 11], which has 480 gaps).
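One way the mapping might work, sketched under assumptions: if bit i of the small sieve stands for the number `(i // len(gaps)) * product + gaps[i % len(gaps)]`, then the numbers on the bigger wheel are a subsequence of the numbers on the smaller one, in the same order. Re-indexing then reduces to dropping the bits whose number is a multiple of the new prime p. The name `compact_sieve` and the list-of-booleans representation are illustrative only.

```python
def compact_sieve(bits, product, gaps, p):
    """Re-index a sieve from wheel (product, gaps) onto the wheel for product * p.

    Hypothetical helper. bits[i] is assumed to describe the number
    (i // len(gaps)) * product + gaps[i % len(gaps)].
    """
    new_bits = []
    for i, bit in enumerate(bits):
        n = (i // len(gaps)) * product + gaps[i % len(gaps)]
        if n % p != 0:
            # Order is preserved, so the new index is simply the position
            # in the output list.
            new_bits.append(bit)
    return new_bits

# Sieve on the [2 3] wheel covering 1, 5, 7, 11, 13, 17, 19, 23, 25, 29.
small = [False, True, True, True, True, True, True, True, False, True]
# Moving to the [2 3 5] wheel drops the entries for 5 and 25.
print(compact_sieve(small, 6, [1, 5], 5))
```

Note that p itself is dropped too, so it has to be recorded as one of the wheel's "first primes" before compacting.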
More specifically, I need help inverting this function (or correctly implementing invIndexOf()):
    def indexOf (self, n):
        if not self.isValidIndex (n): raise Exception ()
        # Full turns of the wheel before n, plus n's position within its turn.
        ret = n // self.product * len (self.gaps) + self.gaps.index (n % self.product)
        assert n in self.invIndexOf (ret)
        return ret
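For what it's worth, here is a sketch of what the inverse could look like, written as free functions so it runs on its own. The names `index_of` and `inv_index_of` and their signatures are made up for illustration. Because the mapping is one-to-one, this version returns a single number, whereas the assert above suggests my invIndexOf() returns a collection.

```python
def index_of(n, product, gaps):
    """Sieve index of n on the wheel (product, gaps)."""
    turn, residue = divmod(n, product)
    # list.index raises ValueError when n is not on the wheel.
    return turn * len(gaps) + gaps.index(residue)

def inv_index_of(index, product, gaps):
    """Number represented by a sieve index; undoes index_of."""
    turn, position = divmod(index, len(gaps))
    return turn * product + gaps[position]

product, gaps = 30, [1, 7, 11, 13, 17, 19, 23, 29]
for n in (1, 7, 29, 31, 49, 209):
    i = index_of(n, product, gaps)
    print(n, i, inv_index_of(i, product, gaps))
```

The two `divmod` calls mirror each other: one splits the number by the product, the other splits the index by the number of gaps.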
Also, it's been a few years since I've figured the asymptotic complexity of anything. I'm pretty sure this is an improvement, though not a drastic one.