
I am writing a class that has some computation-heavy methods, plus some parameters that the user will want to tweak iteratively and that are independent of the computation.

The actual use is for visualization, but here's a cartoon example:

class MyClass(object):

    def __init__(self, x, name, mem=None):

        self.x = x
        self.name = name
        if mem is not None:
            self.square = mem.cache(self.square)

    def square(self, x):
        """This is the 'computation heavy' method."""
        return x ** 2

    def report(self):
        """Use the results of the computation and a tweakable parameter."""
        print("Here you go, %s" % self.name)
        return self.square(self.x)

The basic idea is that the user might want to create many instances of this class with the same x but different name parameters. I want to allow the user to provide a joblib.Memory object that will cache the computation part, so they can "report" to lots of different names without recomputing the squared array each time.

(This is a little weird, I know. The reason the user needs a different class instance for each name is that they'll actually be interacting with an interface function that looks like this.

def myfunc(x, name, mem=None):
    theclass = MyClass(x, name, mem)
    theclass.report()

But let's ignore that for now).


Following the joblib docs, I am caching the square function with the line self.square = mem.cache(self.square). The problem is that, because self differs between instances, the result gets recomputed every time, even when the argument is the same.
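To illustrate why the instance matters, here is a minimal sketch. It does not use joblib and makes no claim about how joblib builds its cache key; it only shows that each bound method carries its own instance. The class name Demo is a hypothetical stand-in for MyClass.

```python
class Demo(object):
    """Hypothetical stand-in for MyClass, reduced to the method of interest."""

    def square(self, x):
        return x ** 2


a = Demo()
b = Demo()

# Both bound methods wrap the same underlying function...
same_function = a.square.__func__ is b.square.__func__
# ...but each one is tied to its own instance,
bound_to_own_instance = a.square.__self__ is a and b.square.__self__ is b
# so the two bound methods compare as different callables.
different_callables = a.square != b.square
```

Anything that treats the bound method (and therefore self) as part of the call will see two different inputs here, even though the computation is identical.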

I am guessing that the correct way to handle this is changing the line to

self.square = mem.cache(self.square, ignore=["self"])

However, are there any drawbacks to this approach? Is there a better way to accomplish the caching?

Kevin J. Chase
mwaskom
  • Were you able to resolve this issue? Or do we simply follow the docs? – wadkar Apr 26 '16 at 00:47
  • Now that I think about it, the docs give the generic approach which must allow for the case when invocation of `square` could yield different results _even with same arguments_ on different instances of the `MyClass`. The `square` method as you described would be a `@staticmethod` because it looks like calling the method with same arguments does not change the result. This can be achieved by annotating with `@staticmethod` and making sure the definition does not have `self` as argument, e.g. `@staticmethod #newline def square(x):` – wadkar Apr 26 '16 at 01:12

1 Answer


From the joblib docs:

"If you want to use cache inside a class the recommended pattern is to cache a pure function and use the cached function inside your class."

Since you want memory caching to be optional, I recommend something like this:

def square_function(x):
    """This is the 'computation heavy' function."""
    print('    square_function is executing, not using cached result.')
    return x ** 2

class MyClass(object):

    def __init__(self, x, name, mem=None):
        self.x = x
        self.name = name
        # Cache the pure module-level function, not a bound method.
        if mem is not None:
            self._square_function = mem.cache(square_function)
        else:
            self._square_function = square_function

    def square(self, x):
        return self._square_function(x)

    def report(self):
        print("Here you go, %s" % self.name)
        return self.square(self.x)


from tempfile import mkdtemp
cachedir = mkdtemp()

from joblib import Memory
# Recent joblib releases take `location`; older ones used `cachedir`.
memory = Memory(location=cachedir, verbose=0)

objects = [
    MyClass(42, 'Alice (cache)', memory),
    MyClass(42, 'Bob (cache)', memory),
    MyClass(42, 'Charlie (no cache)')
]

for obj in objects:
    print(obj.report())

Execution yields:

Here you go, Alice (cache)
    square_function is executing, not using cached result.
1764
Here you go, Bob (cache)
1764
Here you go, Charlie (no cache)
    square_function is executing, not using cached result.
1764
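
The same pattern can be checked without joblib installed. The following is only a sketch under stated assumptions: DictMemory is a hypothetical in-memory stand-in that exposes a joblib-like cache method, and the calls list is added purely to count real executions. Neither is part of joblib.

```python
class DictMemory(object):
    """Hypothetical in-memory stand-in with a joblib-like ``cache`` method."""

    def __init__(self):
        self._store = {}

    def cache(self, func):
        def cached(*args):
            # Key on the function's identity by name plus its (hashable) arguments,
            # so every wrapper created from this object shares the same results.
            key = (func.__module__, func.__name__, args)
            if key not in self._store:
                self._store[key] = func(*args)
            return self._store[key]
        return cached


calls = []  # records every real execution of the heavy function


def square_function(x):
    calls.append(x)
    return x ** 2


class MyClass(object):

    def __init__(self, x, name, mem=None):
        self.x = x
        self.name = name
        # The cached callable is a pure function, so ``self`` never enters the key.
        if mem is not None:
            self._square_function = mem.cache(square_function)
        else:
            self._square_function = square_function

    def report(self):
        return self._square_function(self.x)


memory = DictMemory()
results = [MyClass(42, name, memory).report() for name in ("Alice", "Bob")]
uncached = MyClass(42, "Charlie").report()
```

Here the function body runs once for the two instances that share the memory object and once more for the instance created without one, which matches the output above.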
Yu-Lin Chen
JosiahJohnston