1

I'd like to pre-calculate an array of values of some unary function f.

I know that I'll only need the values of f(x) where x is of the form a*b, where both a and b are integers in the range 0..N.

The obvious time-optimized choice is to make an array of size N*N and pre-calculate only the elements which I'm going to read later. For f(a*b), I'd just check and set tab[a*b]. This is the fastest method possible - however, it's going to take a lot of space, as there are lots of indices in this array which will never be touched (the first being the smallest prime greater than N).
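
For concreteness, a minimal sketch of this direct-indexing approach (my illustration; f_impl and the names are stand-ins, not real code from my project):

#include <cstddef>
#include <vector>

const std::size_t N = 1000;                        // illustrative size
double f_impl(std::size_t x) { return 0.5 * x; }   // stand-in for the real f

std::vector<double> tab(N * N + 1);                // one slot per product 0..N*N

void precalc(std::size_t a, std::size_t b) {
    tab[a * b] = f_impl(a * b);                    // fill only the slots read later
}

double lookup(std::size_t a, std::size_t b) {
    return tab[a * b];                             // single indexed read: branchless, O(1)
}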

Another solution is to make a simple tree map... but this slows down the lookup itself very heavily by introducing lots of branches. No.

I wonder - is there any solution that makes such an array less sparse and smaller, while keeping the lookup quick, branchless, and O(1)?

edit

I can hear lots of comments about a hash map... I'll proceed to benchmark how one behaves (I expect a significant performance drop compared to plain array lookup due to branching; less than in trees, but still... let's see if I'm right!).

I'd like to emphasize: I'd mostly appreciate an analytical solution which would use some clever way (?) to take advantage of the fact that only "product-like" indices are taken. I feel that this fact might be exploited to get a way better result than an average generic hash function, but I'm out of ideas myself.

edit

Following your advice, I've tried std::unordered_map from gcc 4.5. This was a tad slower than the simple array lookup, but indeed much faster than the tree-based std::map - ultimately I'm OK with this solution. I understand now why it's not possible to do what I originally intended to; thanks for the explanations!

I'm just unsure whether the hash map actually saves any memory! :) As @Keith Randall has described, I cannot get the memory footprint lower than N*N/4, and the triangular matrix approach described by @Sjoerd gives me N*N/2. I think it's entirely possible for the hash map to use more than N*N/2 space if the element size is small (it depends on the container overhead) - which would make the fastest approach also the most memory-efficient one! I'll try to check that.

I wish I could accept 2 answers...

Kos
  • 70,399
  • 25
  • 169
  • 233
  • Can you change `f` to be `f(a, b)`? – James McNellis Jan 15 '11 at 01:43
  • Of course... even a wrapper can do that. The result depends only on the product. – Kos Jan 15 '11 at 01:47
  • Well, it wasn't clear if you actually knew `a` and `b` at the time of the call or whether you just had `x`. – James McNellis Jan 15 '11 at 01:53
  • 1
    If you find a fast way to skip unused entries, it could be turned into a fast way to tell whether a number is prime. As the latter is considered a hard problem, I doubt you'll get a good analytical solution for your problem. – Sjoerd Jan 16 '11 at 00:38

4 Answers

5

Start by looking at it as a two-dimensional array: tab[a][b]. This still requires N*N space.

Each entry will be used, but there will be duplication: f(a,b) = f(b,a). So only a triangular matrix is required (at the cost of one branch for a>b vs a<b).

if (a < b) return tab[b*(b+1)/2 + a]; // assuming 0 <= a < b < N
else return tab[a*(a+1)/2 + b];       // assuming 0 <= b <= a < N

Or

if (a < b) return tab[b*(b-1)/2 + a]; // assuming 1 <= a < b <= N
else return tab[a*(a-1)/2 + b];       // assuming 1 <= b <= a <= N

EDIT: the memory used by a triangular matrix is (N+1)*N/2, about half the size of a square matrix. Still quadratic, though :(

EDIT2: Note that there is still duplication in the matrix: e.g. f(3, 2) = f(6, 1). I don't think this can be eliminated without introducing lots of branches and loops, but that's just a gut feeling.
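
Putting it together, a self-contained sketch of the 0-based variant (my illustration; the double value type and the names are placeholders):

#include <cstddef>
#include <utility>
#include <vector>

struct TriangularTable {
    std::vector<double> tab;

    // Row b starts at offset b*(b+1)/2, so n rows need n*(n+1)/2 slots.
    explicit TriangularTable(std::size_t n) : tab(n * (n + 1) / 2) {}

    double& at(std::size_t a, std::size_t b) {
        if (a > b) std::swap(a, b);      // the one branch: f(a,b) == f(b,a)
        return tab[b * (b + 1) / 2 + a];
    }
};

Filling is tab.at(a, b) = value; reading is tab.at(a, b).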

Sjoerd
  • 6,837
  • 31
  • 44
2

There doesn't seem to be a lot of structure to take advantage of here. If you're asking whether there is a way to arrange the table such that you can avoid storage for entries that can't happen (because they have a prime factor larger than N), you can't save much. There is a theory of smooth numbers which states that the density of N-smooth numbers near N^2 is ~2^-2. So, in the absolute best case, you can reduce the (maximum) storage requirement by at most a factor of 4.
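
If you want to check the density empirically, a brute-force count along these lines (my sketch, feasible only for modest N) shows what fraction of indices in [0, N*N] actually occur as products:

#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t N = 1000;
    std::vector<bool> used(N * N + 1, false);
    for (std::size_t a = 0; a <= N; ++a)
        for (std::size_t b = a; b <= N; ++b)  // b >= a suffices, by symmetry
            used[a * b] = true;
    std::size_t count = 0;
    for (bool u : used) count += u;
    std::printf("%zu of %zu indices used\n", count, used.size());
}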

I think you're better off taking advantage of symmetry and then using a hash table if you expect most arguments to never occur.

Keith Randall
  • 22,985
  • 2
  • 35
  • 54
0

Why not simply hash the a and b combo and put the results in a map? And do it lazily, so you only compute the ones you actually need?

// TypePair must implement equals() and hashCode()
private final Map<TypePair, Result> map = new HashMap<TypePair, Result>();

public Result f(Type1 a, Type2 b) {
    TypePair key = new TypePair(a, b);
    Result res = map.get(key);
    if (res == null) {
        res = reallyCalculate(a, b);  // compute on first use...
        map.put(key, res);            // ...then cache for next time
    }
    return res;
}

Basic memoization.
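
Since the question is C++ and the comments establish that the result depends only on the product, the same idea gets simpler there - the product itself can be the key, with no pair type or custom hash needed. A sketch (reallyCalculate is a stand-in for the real computation):

#include <cstdint>
#include <unordered_map>

double reallyCalculate(std::uint64_t x) { return 0.5 * x; }  // stand-in

double f(std::uint32_t a, std::uint32_t b) {
    static std::unordered_map<std::uint64_t, double> cache;
    const std::uint64_t key = static_cast<std::uint64_t>(a) * b;  // f depends only on a*b
    auto it = cache.find(key);
    if (it == cache.end())
        it = cache.emplace(key, reallyCalculate(key)).first;      // memoize on first use
    return it->second;
}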

Will Hartung
  • 115,893
  • 19
  • 128
  • 203
  • 1
    This looks an awful lot like Not C++. – James McNellis Jan 15 '11 at 01:47
  • Due to the distribution of `(a,b)`, there will be an awful lot of collisions. And each collision is another branch. Not to mention the memory overhead for the collision lists. For small N, a simple N*N array will require less memory and be faster. Maybe the actual N will be in that range? – Sjoerd Jan 15 '11 at 01:58
  • Depends on the hash function and the size of the underlying array. If the array is large enough and the hash is good, there is no benefit to using an n x n array - especially if it is sparse. – Andrei Krotkov Jan 15 '11 at 02:47
  • @Andrei You rely on "if [...] the hash is good." That's a big assumption. – Sjoerd Jan 16 '11 at 00:12
  • Well, the programmer has control over the hash function, so it's up to them. – Andrei Krotkov Jan 20 '11 at 22:19
0

Hash tables provide a good balance between lookup speed and memory overhead. The C++03 standard library does not provide a hash table, although one is often available as a non-standard extension - see the SGI hash_map for example. C++0x adds std::unordered_map, which the question's edit already uses.

The Poco C++ Libraries also provide HashTable and HashMap classes; see the documentation.

StackedCrooked
  • 34,653
  • 44
  • 154
  • 278