I wanted to implement a computational graph data structure in Rust (one that can hopefully support autodiff), and I decided to use the ndarray crate, since it is the most similar to NumPy. Making a graph data structure isn't all that difficult in itself: you either use a Vec<Node>
where each Node stores the indices of other vector elements, or you give each node reference-counted pointers to the other nodes.
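To make the first option concrete, here is a minimal sketch of the index-based version (ignoring ndarray for a moment); the `Node`/`Graph` names and the `f32` payload are just illustrative, not from any crate:

```rust
// Minimal index-based graph: nodes live in a Vec and refer to each
// other by index, which sidesteps borrow-checker issues entirely.
struct Node {
    value: f32,
    // Indices into `Graph::nodes` of this node's inputs.
    deps: Vec<usize>,
}

struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    // Appends a node and returns its index, which acts as its handle.
    fn add_node(&mut self, value: f32, deps: Vec<usize>) -> usize {
        self.nodes.push(Node { value, deps });
        self.nodes.len() - 1
    }
}

fn main() {
    let mut g = Graph { nodes: Vec::new() };
    let a = g.add_node(1.0, vec![]);
    let b = g.add_node(2.0, vec![]);
    let c = g.add_node(3.0, vec![a, b]);
    assert_eq!(g.nodes[c].deps, vec![a, b]);
}
```

This works fine as long as every node stores the same concrete type.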
However, this approach falls apart when using ndarray. Suppose we work with the fairly common type Array<T, D>.
This is an n-dimensional array that owns its data and has two generic parameters: T for the element type (e.g. f32, i32, Complex<f32>,
etc.) and D for the number of dimensions (which can be made dynamic using IxDyn).
How can I create a useful graph data structure where each node has different generics? For instance, how can I make a graph where one node has some T and D, and the node it points to has a T and D of different types?
You might ask: why would I need that? Why not fix T to some value that makes sense, like f32,
and use dynamically sized arrays? The problem is that I would like to support transformations from real numbers to complex numbers, such as the FFT.
So given that you want to implement some variant of reverse mode automatic differentiation on tensors, how would you implement the computational graph?
This is one of my attempts:
use std::{cell::RefCell, ops::Add, rc::Rc};

use ndarray::prelude::*;
use num_traits::One;

pub enum OpType {
    Add,
    MatVecMul,
    Leaf,
}

// Handle to a node on the tape, generic over the array's element and
// dimension types.
pub struct Expr<'t, T, D> {
    index: usize,
    tape: &'t Tape,
    output: Rc<Array<T, D>>,
}

// Note how I have to give Node generic types due to the Array
pub struct Node<T, D> {
    value: Rc<Array<T, D>>,
    op: OpType,
    deps: [usize; 2],
}

// This is causing all the problems. We cannot return the array from this trait.
pub trait TapeNode {}

impl<T, D> TapeNode for Node<T, D> {}

pub struct Tape {
    nodes: RefCell<Vec<Box<dyn TapeNode>>>,
}

impl Tape {
    pub fn new() -> Self {
        Tape {
            nodes: RefCell::new(Vec::new()),
        }
    }

    pub fn push_leaf<'t, T, D>(&'t self, value: Array<T, D>) -> Expr<'t, T, D>
    where
        T: 'static,
        D: 'static,
    {
        let mut nodes = self.nodes.borrow_mut();
        let value_ref = Rc::new(value);
        let new_node = Node {
            value: value_ref.clone(),
            op: OpType::Leaf,
            deps: [0, 0],
        };
        let len = nodes.len();
        nodes.push(Box::new(new_node));
        Expr { index: len, tape: self, output: value_ref }
    }

    pub fn push_1<'t, T, D>(&'t self, value: Array<T, D>, id_0: usize, op: OpType) -> Expr<'t, T, D>
    where
        T: 'static,
        D: 'static,
    {
        let mut nodes = self.nodes.borrow_mut();
        let value_ref = Rc::new(value);
        let new_node = Node {
            value: value_ref.clone(),
            op,
            deps: [id_0, 0],
        };
        let len = nodes.len();
        nodes.push(Box::new(new_node));
        Expr { index: len, tape: self, output: value_ref }
    }

    pub fn push_2<'t, T, D>(&'t self, value: Array<T, D>, id_0: usize, id_1: usize, op: OpType) -> Expr<'t, T, D>
    where
        T: 'static,
        D: 'static,
    {
        let mut nodes = self.nodes.borrow_mut();
        let value_ref = Rc::new(value);
        let new_node = Node {
            value: value_ref.clone(),
            op,
            deps: [id_0, id_1],
        };
        let len = nodes.len();
        nodes.push(Box::new(new_node));
        Expr { index: len, tape: self, output: value_ref }
    }
}

impl<'t, T, D> Add for Expr<'t, T, D>
where
    T: Add<T, Output = T> + Clone + One + 'static,
    D: Dimension + 'static,
{
    type Output = Self;

    fn add(self, rhs: Self) -> Self::Output {
        // `&Array + &Array` allocates a new array; record it as a new node.
        let added = self.output.as_ref() + rhs.output.as_ref();
        self.tape.push_2(added, self.index, rhs.index, OpType::Add)
    }
}
/*
The problem shows up during backpropagation. For that, we must iterate
through the tape backwards and apply the chain rule repeatedly. This
requires being able to get the array stored at each node, and that is
exactly what fails.
*/
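For contrast, if the tape were homogeneous, the backward iteration described in that comment would be unproblematic. Here is a minimal sketch using plain f64 scalars instead of ndarray arrays (so the generics issue never arises); each node stores the local partial derivatives with respect to its two dependencies, which is an assumption of this sketch rather than something from the code above:

```rust
// Reverse-mode autodiff over a homogeneous tape of f64 scalars.
#[derive(Clone)]
struct ScalarNode {
    deps: [usize; 2],
    // d(this node's output) / d(output of deps[i]); zero for leaves.
    local_grads: [f64; 2],
}

fn backward(tape: &[ScalarNode], result: usize) -> Vec<f64> {
    let mut grads = vec![0.0; tape.len()];
    grads[result] = 1.0; // d(result)/d(result) = 1
    // Walk the tape backwards, applying the chain rule at each node.
    for i in (0..tape.len()).rev() {
        for k in 0..2 {
            grads[tape[i].deps[k]] += tape[i].local_grads[k] * grads[i];
        }
    }
    grads
}

fn main() {
    // Tape for z = x * y + x, with x = 3 (index 0) and y = 4 (index 1):
    //   node 2: m = x * y  (dm/dx = y = 4, dm/dy = x = 3)
    //   node 3: z = m + x  (dz/dm = 1,     dz/dx = 1)
    let tape = vec![
        ScalarNode { deps: [0, 0], local_grads: [0.0, 0.0] }, // leaf x
        ScalarNode { deps: [1, 1], local_grads: [0.0, 0.0] }, // leaf y
        ScalarNode { deps: [0, 1], local_grads: [4.0, 3.0] }, // m = x * y
        ScalarNode { deps: [2, 0], local_grads: [1.0, 1.0] }, // z = m + x
    ];
    let grads = backward(&tape, 3);
    assert_eq!(grads[0], 5.0); // dz/dx = y + 1
    assert_eq!(grads[1], 3.0); // dz/dy = x
}
```

The loop itself is trivial; the difficulty in my code is purely that I cannot get the typed arrays back out of the `Box<dyn TapeNode>` entries.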
This code is inspired by this blog post:
I actually did get such a graph working, where a central list holds pointers to a bunch of dynamically dispatched nodes (this is what the Tape struct does). What's the problem, you say? Well, given an index, suppose I want to retrieve the array stored at that position in the Tape. How would I do that? Since the tape stores pointers to trait objects, should we just add a getter method to the trait definition? No! If we try to return an ndarray Array with generic parameters from a method in the trait definition, the trait is no longer object safe. So, this whole system falls apart. To be honest, it's a bit frustrating.
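One standard workaround for exactly this object-safety problem (I don't know whether it is what neuronika does) is to keep the trait free of generic methods and instead expose an escape hatch to `std::any::Any`, letting the caller downcast to the concrete `Node<T, D>` it expects. A sketch of the pattern, using a plain `Vec<T>` as a dependency-free stand-in for ndarray's `Array<T, D>`:

```rust
use std::any::Any;

// Stand-in for ndarray's Array<T, D>; the pattern is identical.
struct Node<T> {
    value: Vec<T>,
}

// Object-safe trait: no generic methods, just an upcast to Any.
trait TapeNode {
    fn as_any(&self) -> &dyn Any;
}

impl<T: 'static> TapeNode for Node<T> {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

fn main() {
    // Heterogeneous tape: an f32 node and a (re, im) "complex" node.
    let tape: Vec<Box<dyn TapeNode>> = vec![
        Box::new(Node { value: vec![1.0f32, 2.0] }),
        Box::new(Node { value: vec![(1.0f32, 0.0f32)] }),
    ];

    // The caller names the concrete type it expects and downcasts.
    let real = tape[0]
        .as_any()
        .downcast_ref::<Node<f32>>()
        .expect("node 0 should hold f32 data");
    assert_eq!(real.value, vec![1.0, 2.0]);

    // Asking for the wrong type fails gracefully with None.
    assert!(tape[1].as_any().downcast_ref::<Node<f32>>().is_none());
}
```

The cost is that the caller must already know each node's concrete T and D, so type errors surface at runtime rather than compile time.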
Is there any way to make this system work, or does the entire approach need to change? If so, what approach would you use? I know there is some approach that works, because the neuronika crate is able to generate a computational graph with nodes of differing generic types.
EDIT: After reviewing how Neuronika's implementation is able to function with ndarray generics (answer posted below), how would you approach this problem? Would you try a different approach, or would you just stick to some variant of this one?