How to do Expression Trees on complex datatypes

Question

I must admit that I'm absolutely new to expression trees in C#, but I currently need to get used to them. My problem is that I have two data types that each contain an array of arrays. For that reason I created an interface called IArray<>:

    /// <summary>
    /// Interface for all classes that provide a list of IArrayData
    /// </summary>
    public interface IArray<ElementDataType>
        where ElementDataType : struct
    {
        /// <summary>
        /// Returns a single data element from the array container.
        /// </summary>
        /// <param name="index"></param>
        /// <returns></returns>
        IArrayData<ElementDataType> GetElementData(int index);

        /// <summary>
        /// Returns the amount of ArrayData.
        /// </summary>
        int Count
        {
            get;
        }

        /// <summary>
        /// Returns the size of a single dataset in number of elements
        /// (not in bytes!).
        /// </summary>
        /// <returns></returns>
        int GetDataSize();

        /// <summary>
        /// Creates a copy of the array data provider, by copying the metainformation, but not the
        /// contents.
        /// </summary>
        /// <returns></returns>
        IArray<ElementDataType> CloneStub();
    }

IArrayData is defined as follows:

    /// <summary>
    /// DataProvider that provides the internal data as an array.
    /// </summary>
    /// <typeparam name="NativeDataType"></typeparam>
    public interface IArrayData<NativeDataType> where NativeDataType : struct
    {
        /// <summary>
        /// Returns the data in an arbitrary format.
        /// The implementor must take care of an appropriate cast.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <returns></returns>
        T[] GetData<T>() where T : struct;

        /// <summary>
        /// Returns the data as float[].
        /// </summary>
        /// <returns></returns>
        float[] GetData();

        /// <summary>
        /// Sets the data in an arbitrary format.
        /// The implementor must take care of an appropriate cast.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="data_in"></param>
        void SetData<T>(T[] data_in) where T : struct;

        /// <summary>
        /// Sets the data as float[].
        /// </summary>
        /// <param name="data_in"></param>
        void SetData(float[] data_in);
    }

Two types implement the interface. I have to perform mathematical operations on most of the data. Currently I use my own expression evaluator, but I would love to use expression trees because they seem more flexible to me. How would I implement a mathematical operation like + or - on the given interface?

One type I have is what I call VoxelVolume, which stores a 3D block of image data. VoxelVolume implements IArray:

public abstract class VoxelVolume : IArray<float>
{
}

Let's assume I have three VoxelVolumes A, B, and C. Now I want to perform an operation:

VoxelVolume D = (A + B) * C;

Currently I'm doing this with operator overloading and it works quite well. The only problem is that the expression is evaluated operation by operation, so the longer the expression is, the more time and memory it takes. I would prefer to combine the operations into a single step. This is what my current implementation does:

public static IArray<float> AddMul(IArray<float> A, IArray<float> B, IArray<float> C)
{
    IArray<float> Result = A.CloneStub();

    int L = A.Count;
    int N = A.GetDataSize();

    for (int l = 0; l < L; l++)
    {
        float[] a = A.GetElementData(l).GetData();
        float[] b = B.GetElementData(l).GetData();
        float[] c = C.GetElementData(l).GetData();
        float[] d = new float[a.Length];

        for (int n = 0; n < N; n++)
        {
            d[n] = (a[n] + b[n]) * c[n];
        }

        Result.GetElementData(l).SetData(d);
    }

    return Result;
}

But as you may recognize, I have to type a lot to cover all the different operations (+, -, *, / and many more). For that reason I'd like a more generic and flexible way to perform these operations.

Thanks, Martin

Do you have an example, i.e. a calculation you would want the expression to represent? Perhaps write in C# what you would want the expression to illustrate? — Marc Gravell, Jul 25 '11 at 07:59
Hi Marc, I have provided an example. If you need more information or details, please just ask. Thanks Martin — msedi, Jul 25 '11 at 08:15
In the example, why is `A.GetElementData(l).GetData()` evaluated 3 times? — Marc Gravell, Jul 25 '11 at 08:28
I'm sorry. This was a copy and paste error. I corrected it in the example. — msedi, Jul 25 '11 at 08:29

score 2 · Accepted Answer · answered Jul 25 '11 at 08:40

In your example, I'm assuming it would be reasonable to run the GetData() calls in your standard code. Since we might not know the "depth" in advance, we can simplify to a jagged array (a rectangular array would work too, but is harder to work with at all points, so let's not do that). So imagine we have a jagged array instead of a, b, c (we'll assume the answer's d is simple enough). Thus, the expression we actually need to build is:

= ((jagged[0])[n] + (jagged[1])[n]) * (jagged[2])[n]

which is, as a tree:

= *( +((jagged[0])[n], (jagged[1])[n]), (jagged[2])[n])

So we can build an expression (and evaluate), via:

var data = Expression.Parameter(typeof (float[][]));
var n = Expression.Parameter(typeof (int));

var body = 
    Expression.Multiply( // *
        Expression.Add( // +
            Expression.ArrayIndex(
                Expression.ArrayIndex(data, Expression.Constant(0)), n), // a[n]
            Expression.ArrayIndex(
                Expression.ArrayIndex(data, Expression.Constant(1)), n) // b[n]
            ),
        Expression.ArrayIndex(
            Expression.ArrayIndex(data, Expression.Constant(2)), n)  // c[n]
    );
var func = Expression.Lambda<Func<float[][], int, float>>(body, data, n).Compile();


// here, a===jagged[0], b===jagged[1], c===jagged[2]
float[][] jagged = new[] { new[] { 1F, 2F }, new[] { 3F, 4F }, new[] { 5F, 6F } };
for(int i = 0; i < 2; i++)
{
    Console.WriteLine("{0}: {1}", i, func(jagged, i));
}

Obviously your actual code needs to parse your input expression and build a similar tree flexibly; but this should be enough to illustrate a general approach.
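
To connect this back to the loop in the question, here is a minimal sketch that compiles the tree once and then fills a result array in a single pass. The names FusedOps, BuildAddMul, Apply and the local helper `at` are made up for illustration and are not part of any library; the sketch also assumes all input arrays have the same length.

```csharp
using System;
using System.Linq.Expressions;

static class FusedOps
{
    // Builds (data[0][n] + data[1][n]) * data[2][n] and compiles it once.
    public static Func<float[][], int, float> BuildAddMul()
    {
        var data = Expression.Parameter(typeof(float[][]), "data");
        var n = Expression.Parameter(typeof(int), "n");

        // Hypothetical local helper: the expression for data[i][n].
        Func<int, Expression> at = i =>
            Expression.ArrayIndex(
                Expression.ArrayIndex(data, Expression.Constant(i)), n);

        var body = Expression.Multiply(Expression.Add(at(0), at(1)), at(2));
        return Expression.Lambda<Func<float[][], int, float>>(body, data, n).Compile();
    }

    // Applies a compiled per-element function to every index in one pass,
    // so no intermediate array is allocated for (a + b).
    // Assumes every input array has the same length as the first one.
    public static float[] Apply(Func<float[][], int, float> func, params float[][] inputs)
    {
        var result = new float[inputs[0].Length];
        for (int n = 0; n < result.Length; n++)
        {
            result[n] = func(inputs, n);
        }
        return result;
    }
}
```

With the sample data above, `FusedOps.Apply(FusedOps.BuildAddMul(), a, b, c)` for a = {1, 2}, b = {3, 4}, c = {5, 6} yields {20, 48}. In the question's AddMul method, the inner loop over n could be replaced by one such Apply call per element l.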

Hi Marc, thank you very much this seems to me a first step to get rid of my code. I have to dig a little bit more in the details of expression trees, but this gives me a first clue. Do you know if the generated expressions are cached somehow or do I have to do it myself? — msedi, Jul 25 '11 at 08:53
@msedi if you create it like the above there is no cache. If you write an expression in C# via a lambda, the C# compiler will *sometimes* inject a cache as a static field, but only if it can prove to itself that the expression isn't impacted by closures, context, etc. But that does not apply in this case. — Marc Gravell, Jul 25 '11 at 08:58
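
Since nothing is cached automatically here, one way to avoid recompiling is to keep the compiled delegates in a dictionary. This is only a sketch: CompiledExpressionCache, GetOrCompile and the string key are hypothetical, and choosing a key that uniquely identifies an expression is left to the caller.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

static class CompiledExpressionCache
{
    // Maps a caller-chosen key (for example the expression text) to its compiled delegate.
    private static readonly ConcurrentDictionary<string, Func<float[][], int, float>> Cache =
        new ConcurrentDictionary<string, Func<float[][], int, float>>();

    // Returns the cached delegate for the key, building and compiling the tree
    // only if the key has not been seen yet. Under contention the factory may
    // run more than once, but only one result is stored.
    public static Func<float[][], int, float> GetOrCompile(
        string key,
        Func<Expression<Func<float[][], int, float>>> buildTree)
    {
        return Cache.GetOrAdd(key, _ => buildTree().Compile());
    }
}
```

A call might look like `CompiledExpressionCache.GetOrCompile("(a+b)*c", () => (data, n) => (data[0][n] + data[1][n]) * data[2][n])`; a second call with the same key returns the same delegate without compiling again.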
