In an C#-4.0 application, I have a Dictionary of strongly typed ILists having the same length - a dynamically strongly typed column based table. I want the user to provide one or more (python-)expressions based on the available columns that will be aggregated over all rows. In a static context it would be:
IDictionary<string, IList> table;
// ...
IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
sum += (double)a[i] / b[i]; // Expression to sum up
For n = 10^7 this runs in 0.270 sec on my laptop (win7 x64). Replacing the expression by a delegate with two int arguments it takes 0.580 sec, for a nontyped delegate 1.19 sec. Creating the delegate from IronPython with
IDictionary<string, IList> table;
// ...
var options = new Dictionary<string, object>();
options["DivisionOptions"] = PythonDivisionOptions.New;
var engine = Python.CreateEngine(options);
string expr = "a / b";
Func<int, int, double> f = engine.Execute("lambda a, b : " + expr);
IList<int> a = table["a"] as IList<int>;
IList<int> b = table["b"] as IList<int>;
double sum = 0;
for (int i = 0; i < n; i++)
sum += f(a[i], b[i]);
it takes 3.2 sec (and 5.1 sec with Func<object, object, object>
) - factor 4 to 5.5. Is this the expected overhead for what I'm doing? What could be improved?
If I have many columns, the approach chosen above will not be sufficient any more. One solution could be to determine the required columns for each expression and use only those as arguments. The other solution I've unsuccessfully tried was using a ScriptScope and dynamically resolve the columns. For that I defined a RowIterator that has a RowIndex for the active row and a property for each column.
class RowIterator
{
IList<int> la;
IList<int> lb;
public RowIterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
}
A ScriptScope can be created from a IDynamicMetaObjectProvider, which I expected to be implemented by C#'s dynamic - but at runtime engine.CreateScope(IDictionary) is trying to be called, which fails.
dynamic iterator = new RowIterator(a, b) as dynamic;
var scope = engine.CreateScope(iterator);
var expr = engine.CreateScriptSourceFromString("a / b").Compile();
double sum = 0;
for (int i = 0; i < n; i++)
{
iterator.Index = i;
sum += expr.Execute<double>(scope);
}
Next I tried to let RowIterator inherit from DynamicObject and made it to a running example - with terrible performance: 158 sec.
class DynamicRowIterator : DynamicObject
{
Dictionary<string, object> members = new Dictionary<string, object>();
IList<int> la;
IList<int> lb;
public DynamicRowIterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
public override bool TryGetMember(GetMemberBinder binder, out object result)
{
if (binder.Name == "a") // Why does this happen?
{
result = this.a;
return true;
}
if (binder.Name == "b")
{
result = this.b;
return true;
}
if (base.TryGetMember(binder, out result))
return true;
if (members.TryGetValue(binder.Name, out result))
return true;
return false;
}
public override bool TrySetMember(SetMemberBinder binder, object value)
{
if (base.TrySetMember(binder, value))
return true;
members[binder.Name] = value;
return true;
}
}
I was surprised that TryGetMember is called with the name of the properties. From the documentation I would have expected that TryGetMember would only be called for undefined properties.
Probably for a sensible performance I would need to implement IDynamicMetaObjectProvider for my RowIterator to make use of dynamic CallSites, but couldn't find a suited example for me to start with. In my experiments I didn't know how to handle __builtins__
in BindGetMember:
class Iterator : IDynamicMetaObjectProvider
{
IList<int> la;
IList<int> lb;
public Iterator(IList<int> a, IList<int> b)
{
this.la = a;
this.lb = b;
}
public int RowIndex { get; set; }
public int a { get { return la[RowIndex]; } }
public int b { get { return lb[RowIndex]; } }
public DynamicMetaObject GetMetaObject(Expression parameter)
{
return new MetaObject(parameter, this);
}
private class MetaObject : DynamicMetaObject
{
internal MetaObject(Expression parameter, Iterator self)
: base(parameter, BindingRestrictions.Empty, self) { }
public override DynamicMetaObject BindGetMember(GetMemberBinder binder)
{
switch (binder.Name)
{
case "a":
case "b":
Type type = typeof(Iterator);
string methodName = binder.Name;
Expression[] parameters = new Expression[]
{
Expression.Constant(binder.Name)
};
return new DynamicMetaObject(
Expression.Call(
Expression.Convert(Expression, LimitType),
type.GetMethod(methodName),
parameters),
BindingRestrictions.GetTypeRestriction(Expression, LimitType));
default:
return base.BindGetMember(binder);
}
}
}
}
I'm sure my code above is suboptimal, at least it doesn't handle the IDictionary of columns yet. I would be grateful for any advices on how to improve design and/or performance.