17

I found an article that contains this code:

template <typename ReturnType, typename... Args>
std::function<ReturnType (Args...)>
memoize(std::function<ReturnType (Args...)> func)
{
    std::map<std::tuple<Args...>, ReturnType> cache;
    return ([=](Args... args) mutable {
            std::tuple<Args...> t(args...);
            if (cache.find(t) == cache.end())                
                cache[t] = func(args...);
            return cache[t];
    });
}

Can you explain this please? I can't understand many things here, but the weirdest thing is that cache is local and not static, but maybe I'm wrong and...

Jarod42
Mircea Ispas

5 Answers

26

This is a simple C++11 implementation of memoization.

The memoize function returns a closure. The return value is a function that has state other than what is passed through the arguments (in this case, the cache variable).

The [=] bit in the anonymous function indicates that the returned function should take a copy of all local variables it uses, and mutable lets it modify those copies. The cache variable does not need to be static: the copy lives inside the closure object, so it persists across invocations of the returned function.

Thus, each call to memoize will return a different function with its own cache. Subsequent calls to a specific closure returned by memoize will insert/fetch values from that closure's cache.
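To make the "one cache per closure" point concrete, here is a minimal sketch. It repeats the memoize function from the question so that it is self-contained; the `calls` counter, `square`, and `count_square_calls` are hypothetical names added only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <tuple>

// Same memoize as in the question, repeated so the sketch is self-contained.
template <typename ReturnType, typename... Args>
std::function<ReturnType(Args...)>
memoize(std::function<ReturnType(Args...)> func)
{
    std::map<std::tuple<Args...>, ReturnType> cache;
    return [=](Args... args) mutable {
        std::tuple<Args...> t(args...);
        if (cache.find(t) == cache.end())
            cache[t] = func(args...);
        return cache[t];
    };
}

// Hypothetical helper: counts how often the wrapped function really runs.
int calls = 0;
int square(int x) { ++calls; return x * x; }

// Returns the number of real calls to square() made during the demo.
int count_square_calls()
{
    calls = 0;
    std::function<int(int)> raw = square;
    auto a = memoize(raw);  // closure with its own cache
    auto b = memoize(raw);  // second closure, separate cache
    a(3);                   // miss in a's cache: square runs
    a(3);                   // hit in a's cache: square does not run
    b(3);                   // miss in b's cache: square runs again
    return calls;
}
```

Calling `count_square_calls()` should report two real calls: one for each closure's first lookup, and none for the repeated call on `a`.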

You can think of this as roughly equivalent to the more old-school OOP version:

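template <typename ReturnType, typename... Args>
class Memoize
{
    std::function<ReturnType (Args...)> func;         // the wrapped function
    std::map<std::tuple<Args...>, ReturnType> cache;  // results computed so far
public:
    explicit Memoize(std::function<ReturnType (Args...)> f) : func(f) {}

    ReturnType operator() (Args... args)
    {
        std::tuple<Args...> t(args...);
        if (cache.find(t) == cache.end())
            cache[t] = func(args...);
        return cache[t];
    }
};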
André Caron
  • Still, why is this (cache closure) better than declaring cache as static inside the returned lambda? – Nordlöw Sep 23 '11 at 22:18
  • Further, how do we make the returned memoized function thread-safe? Neither a closure cache nor a static cache fixes that, right? – Nordlöw Sep 23 '11 at 22:21
  • 2
    @Nordlöw: The style used in the question is the usual idiom for stateful closures in all programming languages I know. Sticking to well-known usage is a Good Thing (tm). To make the closure thread safe, you can use a reader-writer lock (`boost::shared_mutex`) which can be declared in the same scope as the cache. Lock the map as necessary inside the lambda. – André Caron Sep 23 '11 at 22:37
  • Would using *Intel TBB*'s `tbb::concurrent_unordered_map` over `boost::shared_mutex` be an overkill? – Nordlöw Sep 24 '11 at 12:58
  • How should we lock the `boost::shared_mutex`, using `lock()` or `shared_lock()`? What's the difference? This is a multiple-writer-multiple-reader case, right? – Nordlöw Sep 24 '11 at 13:38
  • 1
    @Nordlöw: As for Intel TBB, you should ask a separate question. As for boost `shared_mutex`, read the documentation. A quick remark would be to lock the cache for read access (shared lock) on lookup and write access (exclusive lock) on insert. – André Caron Sep 24 '11 at 14:52
  • @AndréCaron I added some missing parts so that the code compiles. Your code is much simpler, I agree. What is missing, though, is that the memoization is not tied to a given function func: if you call it with multiple functions, their values get mixed. – Ghita Jan 05 '13 at 18:57
  • @Ghita: the code I put is a direct adaptation of the code provided in the question, which is fixed to a single function. The implementation is meant to show how to write memoization in a way familiar to OOP programmers, and not how to implement each and every variant of memoization. – André Caron Jan 07 '13 at 15:34
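The locking scheme suggested in the comments above (shared lock on lookup, exclusive lock on insert) can be sketched as follows. This is only a sketch under stated assumptions: it uses the standard `std::shared_mutex` (C++17) in place of `boost::shared_mutex`, and the names `memoize_locked` and `State` are hypothetical. Because a mutex cannot be copied, the state lives behind a `std::shared_ptr`, which means that copies of the returned function share one cache, unlike the version in the question.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <tuple>

template <typename ReturnType, typename... Args>
std::function<ReturnType(Args...)>
memoize_locked(std::function<ReturnType(Args...)> func)
{
    // Mutex and cache are kept together and shared by all copies of the closure.
    struct State {
        std::shared_mutex mutex;
        std::map<std::tuple<Args...>, ReturnType> cache;
    };
    auto state = std::make_shared<State>();

    return [state, func](Args... args) -> ReturnType {
        std::tuple<Args...> key(args...);
        {
            // Lookup: many readers may hold the shared lock at once.
            std::shared_lock<std::shared_mutex> read(state->mutex);
            auto it = state->cache.find(key);
            if (it != state->cache.end())
                return it->second;
        }
        // Computed without holding any lock, so two threads may occasionally
        // compute the same value; only one result is kept.
        ReturnType value = func(args...);

        // Insert: exclusive lock. emplace leaves an existing entry untouched
        // if another thread inserted the same key first.
        std::unique_lock<std::shared_mutex> write(state->mutex);
        return state->cache.emplace(key, value).first->second;
    };
}
```

Computing outside the lock trades some duplicated work for shorter critical sections, and it also keeps a wrapped function that calls back into the memoized one from deadlocking on its own lock.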
9

The cache is embedded into the lambda itself, and local to it.

Therefore, if you create two lambdas, each will have a cache of its own.

It's a great way to implement a simple cache: the memory it uses is released as soon as the lambda is destroyed, so the cache cannot keep growing for the whole lifetime of the program.

Matthieu M.
3

"This simple piece of code" can memoize recursive functions too, provided it is properly invoked. Here I give a complete example:

#include <functional>
#include <iostream>
#include <tuple>
#include <map>

template <typename ReturnType, typename... Args>
std::function<ReturnType (Args...)> memoize(std::function<ReturnType (Args...)> func) {
  std::map<std::tuple<Args...>, ReturnType> cache;
  return ([=](Args... args) mutable {
          std::tuple<Args...> t(args...);
          if (cache.find(t) == cache.end())
             cache[t] = func(args...);
          return cache[t];
  });
}

std::function<int (int)> f;
int fib(int n) {
  if  (n < 2) return n;
  return f(n-1) + f(n-2);
}

std::function<int (int, int)> b;
int binomial(int n, int k) {
  if (k == 0 || n == k) return 1;
  return b(n-1, k) + b(n-1, k-1);
}

int main(void) {
  f = memoize(std::function<int (int)>(fib));
  std::cout << f(20) << std::endl;
  b = memoize(std::function<int (int, int)>(binomial));
  std::cout << b(34,15) << std::endl;
}
wojtekw
  • @Felics seems to be asking for an explanation of the code - in particular, the caching mechanism. Can you help? – Lizz Nov 13 '12 at 23:55
  • 1
    The explanation has been given by slackito [link](http://slackito.com/2011/03/17/automatic-memoization-in-cplusplus0x/). – wojtekw Nov 14 '12 at 00:12
2

To quote from the blog where you found this, just below the code:

... the equals sign in [=] means “capture local variables in the surrounding scope by value”, which is needed because we are returning the lambda function, and the local variable will disappear at that moment.

So, cache is copied into the returned function object as if it were a member.

(Note that this simple piece of code will fail to memoize a recursive function. Implementing a fixed-point combinator in C++0x is left as an exercise to the reader.)
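One way to approach that exercise, short of a full fixed-point combinator, is to hand the memoized function back to the body as an explicit parameter, so that recursive calls go through the cache. The sketch below is not from the blog: the `RecursiveMemo` class, the `self` parameter, and `fib50` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <tuple>
#include <utility>

template <typename ReturnType, typename... Args>
class RecursiveMemo
{
    // The body receives the memoizer itself as its first argument.
    std::function<ReturnType(RecursiveMemo&, Args...)> body;
    std::map<std::tuple<Args...>, ReturnType> cache;
public:
    explicit RecursiveMemo(std::function<ReturnType(RecursiveMemo&, Args...)> b)
        : body(std::move(b)) {}

    ReturnType operator()(Args... args)
    {
        std::tuple<Args...> key(args...);
        auto it = cache.find(key);
        if (it != cache.end())
            return it->second;                    // cache hit
        ReturnType value = body(*this, args...);  // recursion re-enters the cache
        cache.emplace(key, value);
        return value;
    }
};

// Fibonacci written against the hypothetical `self` parameter.
long long fib50()
{
    using Fib = RecursiveMemo<long long, int>;
    Fib fib([](Fib& self, int n) -> long long {
        return n < 2 ? n : self(n - 1) + self(n - 2);
    });
    return fib(50);
}
```

Because every recursive call goes through `self`, each Fibonacci number is computed once, and `fib50()` finishes immediately instead of making billions of calls.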

Fred Foo
0

Welcome to the wonderful world of lexical scoping. It can be used to create entire types with public and private members. In functional languages, it's often the only way to do that.

I suggest you read http://mark-story.com/posts/view/picking-up-javascript-closures-and-lexical-scoping, which is about JavaScript, but C++0x adds the same concepts and (almost) the same behavior to C++.

Ben Voigt