1

I'd like to pre-calculate an array of values of some unary function f.

I know that I'll only need the values of f(x) where x is of the form a*b, where both a and b are integers in the range 0..N.

The obvious time-optimized choice is to make an array of size N*N and pre-calculate only the elements which I'm going to read later. For f(a*b), I'd just check and set tab[a*b]. This is the fastest method possible - however, it's going to take a lot of space, as there are lots of indices in this array which will never be touched (the first being the smallest prime greater than N).
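
For concreteness, a minimal sketch of this direct-indexing approach (my illustration; f_impl and the names are stand-ins, not real code from my project):

#include <cstddef>
#include <vector>

const std::size_t N = 1000;                        // illustrative size
double f_impl(std::size_t x) { return 0.5 * x; }   // stand-in for the real f

std::vector<double> tab(N * N + 1);                // one slot per product 0..N*N

void precalc(std::size_t a, std::size_t b) {
    tab[a * b] = f_impl(a * b);                    // fill only the slots read later
}

double lookup(std::size_t a, std::size_t b) {
    return tab[a * b];                             // single indexed read: branchless, O(1)
}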

Another solution is to make a simple tree map... but this slows down the lookup itself very heavily by introducing lots of branches. No.

I wonder - is there any solution that makes such an array less sparse and smaller, while keeping the lookup quick, branchless, and O(1)?

edit

I can hear lots of comments about a hash map... I'll proceed to benchmark how one behaves (I expect a significant performance drop compared to plain array lookup due to branching; less than in trees, but still... let's see if I'm right!).

I'd like to emphasize: I'd mostly appreciate an analytical solution which would use some clever way (?) to take advantage of the fact that only "product-like" indices are taken. I feel that this fact might be exploited to get a way better result than an average generic hash function, but I'm out of ideas myself.

edit

Following your advice, I've tried std::unordered_map from gcc 4.5. This was a tad slower than the simple array lookup, but indeed much faster than the tree-based std::map - ultimately I'm OK with this solution. I understand now why it's not possible to do what I originally intended to; thanks for the explanations!

I'm just unsure whether the hash map actually saves any memory! :) As @Keith Randall has described, I cannot get the memory footprint lower than N*N/4, and the triangular matrix approach described by @Sjoerd gives me N*N/2. I think it's entirely possible for the hash map to use more than N*N/2 space if the element size is small (it depends on the container overhead) - which would make the fastest approach also the most memory-efficient one! I'll try to check that.

I wish I could accept 2 answers...

Kos
  • 70,399
  • 25
  • 169
  • 233
  • Can you change `f` to be `f(a, b)`? – James McNellis Jan 15 '11 at 01:43
  • Of course... even a wrapper can do that. The result depends only on the product. – Kos Jan 15 '11 at 01:47
  • Well, it wasn't clear if you actually knew `a` and `b` at the time of the call or whether you just had `x`. – James McNellis Jan 15 '11 at 01:53
  • 1
    If you find a fast way to skip unused entries, it could be turned into a fast way to tell whether a number is prime. As the latter is considered a hard problem, I doubt you'll get a good analytical solution for your problem. – Sjoerd Jan 16 '11 at 00:38

4 Answers

5

Start by looking at it as a two-dimensional array: tab[a][b]. This still requires N*N space.

Each entry will be used, but there will be duplication: f(a,b) = f(b,a). So only a triangular matrix is required (at the cost of one branch for a>b vs a<b).

if (a < b) return tab[b*(b+1)/2 + a]; // assuming 0 <= a < b < N
else return tab[a*(a+1)/2 + b];       // assuming 0 <= b <= a < N

Or

if (a < b) return tab[b*(b-1)/2 + a]; // assuming 1 <= a < b <= N
else return tab[a*(a-1)/2 + b];       // assuming 1 <= b <= a <= N

EDIT: the memory used by a triangular matrix is (N+1)*N/2, about half the size of a square matrix. Still quadratic, though :(

EDIT2: Note that there is still duplication in the matrix: e.g. f(3, 2) = f(6, 1). I don't think this can be eliminated without introducing lots of branches and loops, but that's just a gut feeling.
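
Putting it together, a self-contained sketch of the 0-based variant (my illustration; the double value type and the names are placeholders):

#include <cstddef>
#include <utility>
#include <vector>

struct TriangularTable {
    std::vector<double> tab;

    // Row b starts at offset b*(b+1)/2, so n rows need n*(n+1)/2 slots.
    explicit TriangularTable(std::size_t n) : tab(n * (n + 1) / 2) {}

    double& at(std::size_t a, std::size_t b) {
        if (a > b) std::swap(a, b);      // the one branch: f(a,b) == f(b,a)
        return tab[b * (b + 1) / 2 + a];
    }
};

Filling is tab.at(a, b) = value; reading is tab.at(a, b).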

Sjoerd
  • 6,837
  • 31
  • 44
2

There doesn't seem to be a lot of structure to take advantage of here. If you're asking whether there is a way to arrange the table such that you can avoid storage for entries that can't happen (because they have a prime factor larger than N), you can't save much. There is a theory of smooth numbers which states that the density of N-smooth numbers near N^2 is ~2^-2. So, in the absolute best case, you can reduce the (maximum) storage requirement by at most a factor of 4.
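
If you want to check the density empirically, a brute-force count along these lines (my sketch, feasible only for modest N) shows what fraction of indices in [0, N*N] actually occur as products:

#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t N = 1000;
    std::vector<bool> used(N * N + 1, false);
    for (std::size_t a = 0; a <= N; ++a)
        for (std::size_t b = a; b <= N; ++b)  // b >= a suffices, by symmetry
            used[a * b] = true;
    std::size_t count = 0;
    for (bool u : used) count += u;
    std::printf("%zu of %zu indices used\n", count, used.size());
}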

I think you're better off taking advantage of symmetry and then using a hash table if you expect most arguments to never occur.

Keith Randall
  • 22,985
  • 2
  • 35
  • 54
0

Why not simply hash the a and b combo and put the results in a map? And do it lazily, so you only compute the ones you actually need?

// TypePair must implement equals() and hashCode()
private final Map<TypePair, Result> map = new HashMap<TypePair, Result>();

public Result f(Type1 a, Type2 b) {
    TypePair key = new TypePair(a, b);
    Result res = map.get(key);
    if (res == null) {
        res = reallyCalculate(a, b);  // compute on first use...
        map.put(key, res);            // ...then cache for next time
    }
    return res;
}

Basic memoization.
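
Since the question is C++ and the comments establish that the result depends only on the product, the same idea gets simpler there - the product itself can be the key, with no pair type or custom hash needed. A sketch (reallyCalculate is a stand-in for the real computation):

#include <cstdint>
#include <unordered_map>

double reallyCalculate(std::uint64_t x) { return 0.5 * x; }  // stand-in

double f(std::uint32_t a, std::uint32_t b) {
    static std::unordered_map<std::uint64_t, double> cache;
    const std::uint64_t key = static_cast<std::uint64_t>(a) * b;  // f depends only on a*b
    auto it = cache.find(key);
    if (it == cache.end())
        it = cache.emplace(key, reallyCalculate(key)).first;      // memoize on first use
    return it->second;
}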

Will Hartung
  • 115,893
  • 19
  • 128
  • 203
  • 1
    This looks an awful lot like Not C++. – James McNellis Jan 15 '11 at 01:47
  • Due to the distribution of `(a,b)`, there will be an awful lot of collisions. And each collision is another branch. Not to mention the memory overhead for the collision lists. For small N, a simple N*N array will require less memory and be faster. Maybe the actual N will be in that range? – Sjoerd Jan 15 '11 at 01:58
  • Depends on the hash function and the size of the underlying array. If the array is large enough and the hash is good, there is no benefit to using an n x n array - especially if it is sparse. – Andrei Krotkov Jan 15 '11 at 02:47
  • @Andrei You rely on "if [...] the hash is good." That's a big assumption. – Sjoerd Jan 16 '11 at 00:12
  • Well, the programmer has control over the hash function, so it's up to them. – Andrei Krotkov Jan 20 '11 at 22:19
0

Hash tables provide a good balance between lookup speed and memory overhead. The C++03 standard library does not provide a hash table, although one is often available as a non-standard extension - see the SGI hash_map for example. C++0x adds std::unordered_map, which the question's edit already uses.

The Poco C++ Libraries also provide HashTable and HashMap classes; see the documentation.

StackedCrooked
  • 34,653
  • 44
  • 154
  • 278