
I am iterating over the search space of valid Python 3 ASTs. With a max recursion depth of 3, my laptop runs out of memory. My implementation makes heavy use of generators, specifically 'yield' and itertools.product().

Ideally, I'd replace product() and the max recursion depth with some sort of iterative deepening, but first things first: Are there any libraries or useful SO posts for out-of-core/external-memory combinatorics?

If not... I am considering the feasibility of using either dask or joblib's delayed()... or perhaps wendelin-core's ZBigArray, though I don't like the looks of its interface:

# imports as in wendelin.core's demo (dbopen comes from wendelin.lib.zodb)
from wendelin.bigarray.array_zodb import ZBigArray
from wendelin.lib.zodb import dbopen
import transaction
import numpy as np

root = dbopen('test.fs')
root['A'] = A = ZBigArray((10,), np.int)  # np.int is just an alias for the builtin int
transaction.commit()

Based on this example, I think my solution would involve an annotation/wrapper function that eagerly converts the generators to ZBigArrays, replacing root['A'] with something like root[get_key(function_name, *function_args)]. It's not pretty, since my generators are not entirely pure--the output is shuffled. In my current version this shouldn't be a big deal, but the previous and next versions involve various NNs and RL rather than mere shuffling.
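
Roughly, the wrapper I have in mind would look something like this sketch (get_key and the backing store here are placeholders of my own, a plain dict standing in for wendelin's root mapping, not part of any of those libraries):

import functools, hashlib, pickle

def get_key(function_name, *function_args):
    # Stable key derived from the function name and its (picklable) arguments.
    return hashlib.sha256(pickle.dumps((function_name, function_args))).hexdigest()

def out_of_core(store):
    # Decorator factory: `store` is any mapping-like object; a dict here, but
    # it could be a persistent mapping such as wendelin's `root`.
    def decorate(gen_func):
        @functools.wraps(gen_func)
        def wrapper(*args):
            key = get_key(gen_func.__name__, *args)
            if key not in store:
                # Eagerly drain the generator; with wendelin this would be a
                # ZBigArray assignment followed by transaction.commit().
                store[key] = list(gen_func(*args))
            return store[key]
        return wrapper
    return decorate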

  • Is this an open-source project (with say a github/gitlab repo)? As someone who has spent a lot of time poking around Python's AST module and parsing logic, this project has my interest – Dillon Davis Feb 02 '22 at 06:51
  • I have literally >100 projects that need to be released. Thanks for the push I needed: [link](https://github.com/InnovAnon-Inc/ProgramSynthesis) – Innovations Anonymous Feb 02 '22 at 16:38

1 Answer


First things first: the reason you're getting the out-of-memory error is that itertools.product() materializes each iterable it's passed into a tuple before yielding a single result. It has no idea whether the function that gave you your generator is idempotent, and even if it did, it wouldn't be able to infer how to call that function again given only the generator. This means itertools.product has no choice but to cache every value of every iterable it's passed.
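
You can see this for yourself with a small demo (noisy is just a stand-in generator): the print calls fire the moment product() is constructed, before the first result is ever requested.

import itertools

def noisy(n):
    for i in range(n):
        print("pulled", i)
        yield i

# Both generators are drained into internal tuples as soon as product() is
# called, before a single pair has been requested:
pairs = itertools.product(noisy(3), noisy(3))   # prints "pulled 0".."pulled 2" twice
first = next(pairs)                             # only now does iteration of the pairs begin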

The solution here is to bite the small performance bullet and either write explicit for loops, or write your own Cartesian product function that takes the functions which produce each generator, so every dimension can be restarted at will. For instance:

def product(*funcs, repeat=None):
    # Each argument is a zero-argument callable that returns a fresh generator,
    # so nothing ever needs to be cached.
    if not funcs:
        yield ()
        return

    if repeat is not None:
        funcs *= repeat

    func, *rest = funcs

    for val in func():
        # Re-create the generators for the remaining dimensions on every pass.
        for res in product(*rest):
            yield (val, ) + res

from functools import partial
values = product(partial(gen1, arg1, arg2), partial(gen2, arg1))

The bonus of rolling your own here is that you can also change how it traverses the A-by-B-by-C... dimensional search space, so that you could do, say, a breadth-first search instead of an iteratively deepening DFS (sketched below). Or you could pick some space-filling curve, such as the Hilbert curve, which would iterate the indices/depths of each dimension of your product() in a locality-preserving fashion.
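
As a rough sketch of the breadth-first idea (diagonal_product is my own name, not a library function): enumerate index tuples in order of increasing total index, pulling from each generator only as far as needed and caching just those values.

from itertools import count

def diagonal_product(*funcs):
    # Traverse the product space "diagonally": emit tuples in order of
    # increasing total index, so shallow combinations appear before any one
    # dimension is explored deeply. Only values pulled so far are cached.
    if not funcs:
        yield ()
        return

    gens = [f() for f in funcs]        # one live generator per dimension
    seen = [[] for _ in funcs]         # values pulled so far, per dimension
    done = [False] * len(funcs)        # which generators are exhausted

    def have(dim, idx):
        # Pull from generator `dim` until seen[dim][idx] exists (or it runs dry).
        while len(seen[dim]) <= idx and not done[dim]:
            try:
                seen[dim].append(next(gens[dim]))
            except StopIteration:
                done[dim] = True
        return idx < len(seen[dim])

    def index_tuples(total, dims):
        # All non-negative index tuples of length `dims` summing to `total`.
        if dims == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in index_tuples(total - first, dims - 1):
                yield (first,) + rest

    for total in count():
        emitted = False
        for idxs in index_tuples(total, len(funcs)):
            if all(have(d, i) for d, i in enumerate(idxs)):
                emitted = True
                yield tuple(seen[d][i] for d, i in enumerate(idxs))
        if not emitted and all(done):
            break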

Apart from that, I have one more thing to point out: you can also implement BFS lazily (using generators) to avoid building a queue that could bloat memory usage as well. See this SO answer, copied below for convenience:

def breadth_first(self):
    # Lazy level-order traversal: no explicit queue, the generator re-consumes
    # its own output to find the next level's nodes. Note that it stops at the
    # first leaf it meets, so it assumes leaves only occur at the deepest level.
    yield self
    for c in self.breadth_first():
        if not c.children:
            return  # stop the recursion as soon as we hit a leaf
        yield from c.children
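
For example, hooked up to a hypothetical Node class, the traversal above yields nodes level by level without ever building a queue:

class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = children
    breadth_first = breadth_first   # reuse the free function above as a method

root = Node("a", Node("b", Node("d"), Node("e")), Node("c", Node("f"), Node("g")))
print([n.label for n in root.breadth_first()])   # ['a', 'b', 'c', 'd', 'e', 'f', 'g']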

Overall, you will take a performance hit from using semi-coroutines, with zero caching, all in Python-land (in comparison to the baked-in and heavily optimized C of CPython's itertools). However, it should still be doable: algorithmic optimizations (avoiding generating semantically nonsensical ASTs, prioritizing the ASTs that suit your goal, etc.) will have a far larger impact than the constant-factor performance hit.
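
For instance, one cheap pruning filter (my own sketch, assuming your candidates are ast.Module nodes) is to discard anything CPython itself refuses to compile:

import ast

def is_plausible(tree):
    # Reject candidate ASTs that CPython won't even compile; this weeds out
    # a lot of structurally invalid trees before any further scoring.
    try:
        ast.fix_missing_locations(tree)
        compile(tree, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError, TypeError):
        return False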

Dillon Davis
  • I was falsely assuming that product's "actual implementation does not build up intermediate results in memory," but replacing it with your implementation seems to have fixed it. Good thing, too, because my attempts at implementing an @out_of_core decorator with e.g., dask, wendelin, joblib, weren't panning out as I expected. I'm now trying to wrap my head around the idea of using a space-filling curve for this. I hypothesize that somehow including "locality of data points" should improve NN performance. – Innovations Anonymous Feb 02 '22 at 22:04