Composition of a hierarchy of functions

Question

Is there a canonical way to express a function that is a composition of a rooted tree of functions?

Here is a concrete example of what I mean by "composition of a tree of functions." Take a rooted tree whose nodes are labelled by functions, like so:

Each function at a node is a composition of the functions at its child nodes. The function that is associated to the tree, itself, is the composition

F = a0(b0(c0(e0, e1, e2)), b1(d0(f0), d1(g0, g1)))

More explicitly, F is a function of 6 arguments that are evaluated by the functions at the leaves:

F(x0, ... , x5) == a0(b0(c0(e0(x0), e1(x1), e2(x2))),
                      b1(d0(f0(x3)), d1(g0(x4), g1(x5))))

General question

Given a rooted tree T, and a list L of functions corresponding to the nodes of T, is there a canonical way to write a function F of the arguments T and L that returns the composition of the functions in L structured according to the tree T?

In this way, the "wiring" of the composition—the tree T—is separated from its internal "components"—the list L. A "canonical" solution should include, in particular, representations of T and L that are naturally adapted to this problem.

I suspect that this problem has a trivial solution in a functional programming language, but ideally I would like to have a solution in a dynamically-typed imperative language like Python, something like

def treecomp(tree, list_of_funcs):
    ...
    return function

F = treecomp(T, L)

Addendum

In the meantime, I came up with my own solution (posted below).

While I am satisfied with its economy and conceptual simplicity, I would nevertheless be interested in other essentially different approaches, especially those that leverage strengths in another language that are either lacking or poorly supported in Python.

Hunch

With proper data structures—that don't essentially reproduce the desired output!—functional-programming idioms should enable a very short solution.

@EugeneHa I am using a dict of functions instead of a list. Is that okay? — Jan, Mar 25 '16 at 16:51
@JeD: A dict would be quite reasonable, but lists are, somehow, more convenient. In the example, both `L` and `T` can be expressed as lexicographic lists: `L = [a0, b0, b1, c0, ... , g1]` and `T = [(0,), (0,0), (0,1), (0,0,0), ... , (0,1,1,1)]`. (Think section numbers in a book.) Then the function-labelled tree is the list `TL = list(zip(T, L))`. So one way to proceed in writing `treecomp` would be to unravel the (implicit) tree structure in `TL` and compose functions, recursively. But I'm not sure whether I'm already buggered if I choose `T` and `L` to be lists!—natural though that seems. — egnha, Mar 25 '16 at 17:23
@JeD: I am not buggered; cf. my solution, which is still clumsy, however. :( — egnha, Mar 28 '16 at 07:57

score 2 · Answer 1 · answered Mar 25 '16 at 15:46

This sounded like an interesting problem, so I gave it a try. Here's the result.

class Node(object):
    def __init__(self, parents, fn):
        self.parents = parents
        self.fn = fn

    def get_number_of_args(self):
        if not self.parents:
            return 1
        if not hasattr(self, '_cached_no_args'):
            self._cached_no_args = sum(
                parent.get_number_of_args() for parent in self.parents
            )
        return self._cached_no_args

    def compose(self):
        if not self.parents:
            def composition(*args):
                return self.fn(*args)
            return composition

        fns = []
        fns_args = []
        for parent in self.parents:
            fns.append(parent.compose())
            fns_args.append(parent.get_number_of_args())
        number_of_args = sum(fns_args)
        length = len(self.parents)

        def composition(*args):
            if len(args) != number_of_args:
                raise TypeError
            sub_args = []
            last_no_args = 0
            reached_no_args = 0
            for i in range(length):
                fn = fns[i]
                no_args = fns_args[i]
                reached_no_args += no_args
                args_cut = args[last_no_args: reached_no_args]
                sub_call = fn(*args_cut)
                sub_args.append(sub_call)
                last_no_args = reached_no_args  # next slice starts where this one ended
            return self.fn(*sub_args)

        return composition

You didn't specify how you implement the tree structure, so I combined nodes and functions into one structure (you can always do the mapping yourself). Note that `parents` holds the nodes whose results feed into a node, i.e. its children in your tree. Now the usage:

>>> def fn(x):
...     return x
>>> def fn2(x):
...    return 1
>>> def add(x, y):
...    return x + y
>>> n1 = Node([], fn)
>>> n2 = Node([], fn2)
>>> n3 = Node([n1, n2], add)
>>> fn = n3.compose()
>>> print(fn(5, 7))
6

which is as expected. Feel free to test it (I haven't actually tried it on a deeper tree) and let me know if you find any issues.
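
The slicing step inside `composition` is the part most worth checking on a deeper tree, and it can be tested in isolation. The following is only a sketch: `split_args` is a hypothetical helper, not part of the class above, that cuts a flat argument tuple into one chunk per child using a running offset.

```python
def split_args(args, counts):
    """Cut `args` into consecutive chunks whose sizes are given by `counts`.

    Hypothetical helper for illustration; counts[i] plays the role of the
    number of arguments expected by the i-th child.
    """
    if len(args) != sum(counts):
        raise TypeError('expected %d arguments, got %d' % (sum(counts), len(args)))
    chunks = []
    start = 0
    for count in counts:
        stop = start + count  # running offset, not the size of the previous chunk
        chunks.append(args[start:stop])
        start = stop
    return chunks


# A node whose three children expect 3, 1 and 2 arguments, respectively.
chunks = split_args((10, 11, 12, 13, 14, 15), [3, 1, 2])
print(chunks)  # [(10, 11, 12), (13,), (14, 15)]
```

Each chunk has to start where the previous one ended, which is easy to get wrong when the children expect different numbers of arguments.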

Thanks for your input. As it turns out, it isn't necessary to do any bookkeeping of node count. — egnha, Apr 03 '16 at 18:56

egnha · Answer 2 · 2016-07-24T07:03:24.147

We define a function treecomp that returns the composition of a list of functions L structured according to a rooted tree T, by taking L and T as separate arguments:

F = treecomp(T, L)

Unlike the other solutions proposed thus far, it isn't complicated by unnecessary bookkeeping, such as tracking the number of leaves or arguments (which is better handled by decorators, besides).

A simple construction of `treecomp`

A straightforward realization of treecomp is the following: it merely generates a symbolic (string) expression of the tree composition. Evaluation on arguments is then just a matter of plugging them in and evaluating the resulting expression.

This naive idea can be implemented using fairly basic data structures: lists for the tree and functions, and a simple class for the function-labeled tree. (Namedtuples would also do. However, by using a class with special comparison methods, we can write more semantically natural code.)

Data structures

The most economical encoding of a rooted tree as a flat list is as a list of "node addresses." In a comment to @JeD, I hinted that this could be done by "drawing" the tree:

T = [(0,),
         (0, 0),
             (0, 0, 0),
                 (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
         (0, 1),
             (0, 1, 0),
                 (0, 1, 0, 0),
             (0, 1, 1),
                 (0, 1, 1, 0), (0, 1, 1, 1)]

Here (0,) is the node corresponding to a0, (0, 0) is the node corresponding to b0, (0, 1) is the node corresponding to b1, and so forth, like the numbering of sections in a book. The longest (or "highest") tuples are the leaves.
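
More generally, a leaf is any node without children, whatever the length of its address, so the leaves can be read off the address list directly. The sketch below assumes the list `T` above; `leaves` is a hypothetical helper name. Sorting the leaf addresses also gives the order in which the arguments x0, ..., x5 are consumed.

```python
def leaves(tree):
    """Return the addresses that have no child in `tree`, in lexicographic order."""
    def has_child(addr):
        # A child of `addr` is an address that equals `addr` once its last index is dropped.
        return any(other[:-1] == addr for other in tree)
    return sorted(addr for addr in tree if not has_child(addr))


T = [(0,),
     (0, 0), (0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
     (0, 1), (0, 1, 0), (0, 1, 0, 0), (0, 1, 1), (0, 1, 1, 0), (0, 1, 1, 1)]

for position, addr in enumerate(leaves(T)):
    print('x%d -> %s' % (position, addr))
```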

The list of functions L can then be given as a list matching the order of nodes in T:

L = [a0, b0, c0, e0, e1, e2, b1, d0, f0, d1, g0, g1]

Since the nodes of the tree T are labeled by the functions in L, it will be convenient to have a data structure for that. We define a class that records a node's address and the literal name of the function labeling it; its methods implement comparisons relative to the partial ordering of the tree (where the root is the minimum element):

class SymbNode:
    '''Class that records a node's address and symbol.'''

    def __init__(self, addr, symb):
        self.addr = addr
        self.symb = symb

    def __len__(self): # how "high" a node is above the root
        return len(self.addr)

    def _compare(self, other, segment):
        return self.addr == other.addr[:segment]

    def __le__(self, other):
        return self._compare(other, segment=len(self))

    def begets(self, other):
        return self._compare(other, segment=-1)
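
Both comparisons are plain prefix tests on addresses. As a rough illustration on bare tuples (the two standalone functions below are stand-ins that mirror `__le__` and `begets`; they are not part of the class):

```python
def is_ancestor_or_self(addr, other):
    """Mirror of SymbNode.__le__: `addr` is a prefix of `other`."""
    return addr == other[:len(addr)]


def is_parent(addr, other):
    """Mirror of SymbNode.begets: `addr` is `other` without its last index."""
    return addr == other[:-1]


# Addresses of the nodes labeled b1, d1, g0 and e0 in the example tree.
b1, d1, g0, e0 = (0, 1), (0, 1, 1), (0, 1, 1, 0), (0, 0, 0, 0)

print(is_ancestor_or_self(b1, g0))  # True: g0 lies in the subtree rooted at b1
print(is_ancestor_or_self(b1, b1))  # True: a node is <= itself
print(is_ancestor_or_self(b1, e0))  # False: e0 lies under b0
print(is_parent(d1, g0))            # True: g0 is a child of d1
print(is_parent(b1, g0))            # False: b1 is the grandparent of g0
```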

Implementation

The simple two-step mechanism of treecomp is implemented below. By normalizing the order of the list of SymbNodes, we can build up a symbolic expression by simply "peeling off" each layer of the tree as we move up it.

from functools import partial
from operator import attrgetter

def treecomp(tree, funcs):
    '''Returns the composition of a tree of functions.'''
    symbtree = makesymbtree(tree, funcs)
    symbexp = makesymbexp(symbtree)
    return partial(evalsymbexp, symbexp)  # bind positionally; a keyword would clash with *args

FUNC_CALL = '{func}({{}})'

def makesymbtree(tree, funcs):
    '''Returns the list of SymbNodes, sorted by depth, then by address.'''
    symbols = [FUNC_CALL.format(func=func.__name__) for func in funcs]
    symbtree = sorted((SymbNode(*x) for x in zip(tree, symbols)),
                      key=attrgetter('addr'))
    symbtree.sort(key=len)
    return symbtree

def makesymbexp(symbtree):
    root = symbtree[0]
    if len(symbtree) == 1: # symbtree is a leaf node
        return root.symb
    symbargs = [makesymbexp(subsymbtree(symbtree, root=node))
                for node in symbtree if root.begets(node)]
    return root.symb.format(','.join(symbargs))

def subsymbtree(symbtree, root):
    subsymbtree = [node for node in symbtree if root <= node]
    return subsymbtree

ARGS = 'args[{idx}]'

def evalsymbexp(symbexp, *args):
    '''Returns the evaluation of a symbolic expression on arguments.'''
    argnames = [ARGS.format(idx=str(n)) for n, _ in enumerate(args)]
    return eval(symbexp.format(*argnames))

Verification

Because of the compartmentalization of treecomp, we only need to verify that the function makesymbexp generates the correct symbolic expression, and that the function evalsymbexp properly evaluates symbolic expressions.

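The (essentially one-line) function evalsymbexp is supposed to take a string template and plug in the argument names 'args[0]', 'args[1]', etc., then evaluate the result. It evidently does that, provided the function names in the expression can be resolved where eval runs: the functions must be named (no lambdas) and visible from the module that defines evalsymbexp.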
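
To make the two steps concrete, here is what a template looks like before and after the argument names are plugged in. This is a minimal sketch with a hand-written template and two stand-in functions, `D` and `S`, chosen for illustration. Unlike evalsymbexp, it passes an explicit namespace to `eval` so that the name lookups are easy to follow.

```python
def D(x): return 2 * x
def S(*xs): return sum(xs)

template = 'S(D({}),D({}))'  # hand-written stand-in for a symbolic expression
args = (3, 4)

# One placeholder name per argument: 'args[0]', 'args[1]', ...
argnames = ['args[{idx}]'.format(idx=n) for n, _ in enumerate(args)]
expression = template.format(*argnames)
print(expression)  # S(D(args[0]),D(args[1]))

# Evaluate the filled-in template; every name it uses is listed explicitly.
value = eval(expression, {'S': S, 'D': D, 'args': args})
print(value)  # 14
```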

As for makesymbexp, we can gain confidence in its correctness, in lieu of a formal proof (which we eschew), by checking its output on some test data. Take, for example, the following functions:

def D(x): return 2*x
def M(x): return -x
def S(*xs): return sum(xs)

a0 = S
b0, b1 = D, S
c0, d0, d1 = S, D, S
e0, e1, e2, f0, g0, g1 = D, M, D, M, D, M

With T and L as above, we can check that we get the right symbolic expression:

makesymbexp(makesymbtree(T, L))

indeed yields the string

'S(D(S(D({}),M({}),D({}))),S(D(M({})),S(D({}),M({}))))'

To check the delegation of treecomp to evalsymbexp, as a partial function, I verified that the value of

F = treecomp(T, L)
F(x0, x1, x2, x3, x4, x5)

agreed with the value of

a0(b0(c0(e0(x0), e1(x1), e2(x2))), b1(d0(f0(x3)), d1(g0(x4), g1(x5))))

on 1000 random samples of x0, … , x5 drawn from the integers between -100 and 100.
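
For comparison, the same address-list encoding also admits a variant that avoids `eval` by composing closures. The following is only a sketch under the same assumptions (a list of address tuples and a matching list of functions); `treecomp_closures` is a made-up name, and the number of arguments is deliberately not checked.

```python
def treecomp_closures(tree, funcs):
    """Compose `funcs` along `tree` with closures instead of eval (illustrative)."""
    func_at = dict(zip(tree, funcs))
    children = {addr: [] for addr in tree}
    root = min(tree, key=len)            # the shortest address is the root
    for addr in sorted(tree):            # sorted order keeps siblings left to right
        if addr != root:
            children[addr[:-1]].append(addr)

    def build(addr):
        func = func_at[addr]
        if not children[addr]:
            return lambda it: func(next(it))  # a leaf consumes one argument
        parts = [build(child) for child in children[addr]]
        return lambda it: func(*[part(it) for part in parts])

    evaluate = build(root)
    return lambda *args: evaluate(iter(args))


def D(x): return 2 * x
def M(x): return -x
def S(*xs): return sum(xs)

T = [(0,),
     (0, 0), (0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
     (0, 1), (0, 1, 0), (0, 1, 0, 0), (0, 1, 1), (0, 1, 1, 0), (0, 1, 1, 1)]
L = [S, D, S, D, M, D, S, D, M, S, D, M]  # a0, b0, c0, e0, e1, e2, b1, d0, f0, d1, g0, g1

F = treecomp_closures(T, L)
by_hand = S(D(S(D(1), M(2), D(3))), S(D(M(4)), S(D(5), M(6))))
print(F(1, 2, 3, 4, 5, 6), by_hand)  # 8 8
```

Because the leaves pull their arguments from a shared iterator in left-to-right order, no node needs to know how many arguments its subtree takes.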

mgilson · Answer 3 · 2016-03-25T16:06:04.180

Here's a simple example that I've cooked up:

from collections import deque

class Node(object):
    def __init__(self, children, func):
        self.children = children
        if children:
            self.leaf_count = sum(c.leaf_count for c in children)
        else:
            self.leaf_count = 1  # It is a leaf node.

        self.func = func

    def __call__(self, *args):
        if not self.children:
            assert len(args) == 1, 'leaf can only accept 1 argument.'
            return self.func(*args)  # Base case.

        d_args = deque(args)
        func_results = []
        for child in self.children:
            f_args = [d_args.popleft() for _ in range(child.leaf_count)]
            func_results.append(child(*f_args))
        assert not d_args, 'Called with the wrong number of arguments'
        return self.func(*func_results)

Basically, the "trick" is to keep track of how many leaf nodes each node has, since the number of leaf nodes in a node's subtree is the number of arguments that it is expected to accept.

If a node is a leaf, then simply call its delegate function with the single input argument.
If a node is not a leaf, then call each child, supplying the arguments according to the number of leaf nodes that are in the child's subtree.

A few implementation notes:

I used a collections.deque to pull the correct number of args to pass to each child. This is for efficiency, since deque lets us take each argument off the front in O(1) time. Otherwise, we'd be left with something like:

for child in self.children:
    f_args = args[:child.leaf_count]
    args = args[child.leaf_count:]
    func_results.append(child(*f_args))

But that is O(N) time at each pass. For small trees it probably doesn't matter. For big trees it might matter a lot :-).
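
Another way to avoid the repeated slicing, sketched here as a standalone helper rather than as a change to the class above, is to walk a single iterator over the arguments with `itertools.islice`. The name `take_chunks` is made up for this illustration.

```python
from itertools import islice


def take_chunks(args, counts):
    """Yield consecutive chunks of `args`, with sizes taken from `counts`."""
    it = iter(args)
    for count in counts:
        # islice advances the shared iterator, so each argument is visited once.
        yield list(islice(it, count))


print(list(take_chunks([1, 2, 3, 4, 5, 6], [3, 1, 2])))  # [[1, 2, 3], [4], [5, 6]]
```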

I also compute leaf_count once, in the constructor, which means that you need to build your tree from the leaves to the root. Of course, you could use different strategies depending on the problem constraints. For example, you could construct your tree and then fill in leaf_count in a single pass before you start evaluating functions, or you could turn leaf_count into a property (@property) that counts the leaves on every call (which might get expensive for large trees).
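
A middle ground between these strategies is a property that counts the leaves on first use and caches the result. The class below is a hypothetical sketch (`LazyNode` is not the `Node` class above), and it assumes the tree is complete before `leaf_count` is first read.

```python
class LazyNode(object):
    """Variant of Node whose leaf count is computed on first use and cached."""

    def __init__(self, func, children=None):
        self.func = func
        self.children = children if children is not None else []
        self._leaf_count = None

    @property
    def leaf_count(self):
        if self._leaf_count is None:
            if self.children:
                self._leaf_count = sum(c.leaf_count for c in self.children)
            else:
                self._leaf_count = 1  # a leaf accepts exactly one argument
        return self._leaf_count


# The tree can now be built from the root down.
root = LazyNode(sum)
left, right = LazyNode(sum), LazyNode(abs)
root.children.extend([left, right])
left.children.extend([LazyNode(abs), LazyNode(abs)])
print(root.leaf_count)  # 3
```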

And now some tests... The simplest case that I could think of was if the leaf nodes are all associated with the identity function and then the non-leaf nodes are a function that adds the input values. In this case, the result should always be the sum of the input values:

def my_sum(*args):
    return sum(args)

def identity(value):
    return value

e0, e1, e2, f0, g0, g1 = [Node([], identity) for _ in range(6)]
c0 = Node([e0, e1, e2], my_sum)
d0 = Node([f0], my_sum)
d1 = Node([g0, g1], my_sum)
b0 = Node([c0], my_sum)
b1 = Node([d0, d1], my_sum)
a0 = Node([b0, b1], my_sum)

arg_tests = [
    (1, 1, 1, 1, 1, 1),
    (1, 2, 3, 4, 5, 6)
]
for args in arg_tests:
    assert a0(*args) == sum(args)
print('Pass!')

Very clear and readable explanation of your implementation. As it turns out, the "trick" is unnecessary if you put the tree in a normalized form. (And if the user gives the wrong number of args, let the execution fail—better to ask for forgiveness than permission. ;) It looks like the way you have set up the labeling of the tree basically amounts to writing the composition by hand. — egnha, Mar 28 '16 at 07:53

score 1 · Answer 4 · answered Mar 25 '16 at 15:56

Since you want to decouple the functions from the tree, you can do this:

#root=RootNode, funcs=Map from Node to function, values=list of inputs
#nodes need isLeaf and children field
def Func(root,funcs,values):
    #check if leaf
    if root.isLeaf:
        #removes value from list
        val=values.pop(0)
        #returns function of root
        #(can be identity if you just want to input values)
        return funcs[root](val)

    #else we do a recursive iteration:
    else:
        nextVals=[]
        #for each child
        for child in root.children:
            #call this function->DFS for roots, removes values that are used
            nextVals.append(Func(child,funcs,values))
        #unpack list and call function
        return funcs[root](*nextVals)

Here is an example:

class Node:
    def __init__(self,isLeaf):
        self.isLeaf=isLeaf
        #one list per node (a class-level list would be shared by all nodes)
        self.children=[]

    def add(self,n):
        self.children.append(n)




def Func(root,funcs,values):
    #check if leaf
    if root.isLeaf:
        #removes value from list
        val=values.pop(0)
        #returns function of root
        #(can be identity if you just want to input values)
        return funcs[root](val)

    #else we do a recursive iteration:
    else:
        nextVals=[]
        #for each child
        for child in root.children:
            #call this function->DFS for roots, removes values that are used
            nextVals.append(Func(child,funcs,values))
        #unpack list and call function
        return funcs[root](*nextVals)


def sum3(a,b,c):
    return a+b+c


import math

#create the root before using it as a key
root=Node(False)
funcMap={}
funcMap[root]=sum3

layer1=[Node(True) for i in range(3)]
for i in range(3):
    root.add(layer1[i])
    funcMap[layer1[i]]=math.sin

print(Func(root,funcMap,[1,2,3]))
print(math.sin(1)+math.sin(2)+math.sin(3))

Both lines print the same value (using Python 2.7).
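
Note that `values.pop(0)` empties the caller's list as the tree is evaluated. If that is unwanted, one possible variation (a sketch only; `TreeNode` and `evaluate` are names invented here) walks an iterator over the inputs instead and treats a node without children as a leaf.

```python
import math


class TreeNode(object):
    """Minimal stand-in for the node type: a leaf is a node without children."""

    def __init__(self):
        self.children = []  # one list per instance


def evaluate(root, funcs, values):
    """Evaluate the tree at `root`, feeding `values` to the leaves left to right."""
    remaining = iter(values)  # consumed depth-first; `values` itself is untouched

    def visit(node):
        if not node.children:
            return funcs[node](next(remaining))
        return funcs[node](*[visit(child) for child in node.children])

    return visit(root)


def sum3(a, b, c):
    return a + b + c


root = TreeNode()
funcs = {root: sum3}
for _ in range(3):
    leaf = TreeNode()
    root.children.append(leaf)
    funcs[leaf] = math.sin

inputs = [1, 2, 3]
result = evaluate(root, funcs, inputs)
print(abs(result - (math.sin(1) + math.sin(2) + math.sin(3))) < 1e-12)  # True
print(inputs)  # [1, 2, 3]
```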

Sci Prog · Answer 5 · 2016-03-25T15:47:30.490

This would be a good candidate for object-oriented programming (OOP). For example, you could use these three classes

Node
Leaf
Tree

For working with tree structures, recursive methods are usually easier.

Alternatively, you can build a recursive structure directly by embedding tuples within tuples. For example:

n1 = ( 'L', 'e0' )
n2 = ( 'L', 'e1' )
n3 = ( 'L', 'e2' )
n4 = ( 'N', 'c0', n1, n2, n3 )
n5 = ( 'N', 'b0', n4 )

This is not your complete tree, but it can be easily expanded. Just use print(n5) to see the result.

This is not the only way; there could be variations. For each tuple, the first item is a letter specifying whether it is a leaf ("L") or a node ("N"), which makes recursive functions easier to write. The second item is the name (taken from your drawing). For a node, the remaining items are its child nodes.

(N.B. I once used "tuples within tuples" to implement the Huffman encoding algorithm, which also works with a tree structure.)
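
As a small illustration of how the "L"/"N" tags help a recursive function, the sketch below renders such a nested tuple as a composition string. The function name `to_expression` is an assumption made for this example.

```python
def to_expression(node):
    """Render a nested-tuple tree as a composition string."""
    kind, name, children = node[0], node[1], node[2:]
    if kind == 'L':
        return name  # a leaf is rendered by its name alone
    rendered = ', '.join(to_expression(child) for child in children)
    return '%s(%s)' % (name, rendered)


n1 = ('L', 'e0')
n2 = ('L', 'e1')
n3 = ('L', 'e2')
n4 = ('N', 'c0', n1, n2, n3)
n5 = ('N', 'b0', n4)

print(to_expression(n5))  # b0(c0(e0, e1, e2))
```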