27

Are LINQ expression trees proper trees, that is, graphs (directed or not; Wikipedia does not seem to agree) without cycles? What is the root of the expression tree for the following C# expression?

(string s) => s.Length

The expression tree looks like this, with "->" denoting the name of the property of the node the other node is accessible through.

     ->Parameters[0]
 Lambda---------Parameter(string s)
    \               /
     \->Body       /->Expression
      \           /
      Member(Length)

When using ExpressionVisitor to visit the LambdaExpression, the ParameterExpression is visited twice. Is there a way to use the ExpressionVisitor to visit the LambdaExpression so that all the nodes are visited exactly once, and in a specific, well-known order (pre-order, in-order, post-order etc.)?

cynic
  • Why do you need this, why do you care? – Daniel Hilgarth Jan 24 '12 at 12:50
  • @DanielHilgarth I think this is a legitimate question about how the underlying concepts of Expression Trees work. This is a Q&A site, and it appears that the questioner is curious about how Expression Trees are working. – David Hoerster Jan 24 '12 at 13:02
  • Maybe this question is just a poll, but it's interesting. – Saeed Amiri Jan 24 '12 at 13:07
  • Does anyone know which branch of mathematics expression trees fall under? I would like to take a look at the mathematical concepts. – Ortiga Jan 24 '12 at 13:11
  • @DanielHilgarth I am working around specific issues in the Entity Framework LINQ provider, and I need to know all the branches (root to leaf) of an expression tree to do so. – cynic Jan 24 '12 at 13:23
  • @DavidHoerster: It indeed is a legitimate question, but more often than not, people ask such questions trying to work around some problem. I would rather know that original problem and help fix it. – Daniel Hilgarth Jan 24 '12 at 14:31

2 Answers

17

Sort of, yes. The actual "trunk" (if you will) of a LambdaExpression is the .Body; the parameters are necessary metadata about the structure of the tree (and what it needs), but .Parameters at the top (your dotted line) isn't really part of the tree's functional graph - it is only when those nodes are used later in the actual body of the tree that they are interesting, as value substitutions.

The ParameterExpression being visited twice is essential, so that it is possible for someone to swap the parameters if they wanted - for example, to build an entire new LambdaExpression with the same number of parameters, but different parameter instances (maybe changing the type).
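To make the parameter swap concrete, here is a minimal sketch of a visitor that replaces one parameter instance with another of the same type. `ParameterSwapper`, `SwapDemo`, and the `Hits` counter are hypothetical names introduced for illustration; only `ExpressionVisitor`, `VisitParameter`, and the `Expression` factory methods come from the framework. For `(string s) => s.Length`, the counter should end at 2: one hit for the use inside the body and one for the entry in `Parameters`.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: swaps one parameter instance for another
// wherever the visitor encounters it.
sealed class ParameterSwapper : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly ParameterExpression _to;

    // Counts how many times the original parameter was encountered.
    public int Hits { get; private set; }

    public ParameterSwapper(ParameterExpression from, ParameterExpression to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        if (node == _from)
        {
            Hits++;
            return _to;
        }
        return base.VisitParameter(node);
    }
}

static class SwapDemo
{
    public static (LambdaExpression Rewritten, ParameterExpression Replacement, int Hits) Run()
    {
        Expression<Func<string, int>> original = s => s.Length;

        // Same type as the original parameter, but a different instance.
        ParameterExpression replacement = Expression.Parameter(typeof(string), "t");

        var swapper = new ParameterSwapper(original.Parameters[0], replacement);
        var rewritten = (LambdaExpression)swapper.Visit(original);

        return (rewritten, replacement, swapper.Hits);
    }
}
```

If the visitor skipped the `Parameters` collection, the rewritten lambda would declare the old parameter while its body used the new one, which is why both visits matter for rewriting.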

The order will be fairly stable, but should be considered an implementation detail. For example, given a node such as Add(A,B), it should make no semantic difference whether I visit that A-first vs B-first.

Marc Gravell
  • Thanks. It would be nice to know which properties of the Expression classes constitute "proper" edges in the expression trees. – cynic Jan 24 '12 at 13:21
  • @cynic pretty much anything that is an `Expression` that isn't `LambdaExpression.Parameters` ! I can't think of any others off the top of my head, although I'm only really considering 3.5-style expressions; there may be some other similar metadata values on some of the 4.0 node-types... – Marc Gravell Jan 24 '12 at 13:32
17

Just to add a bit to Marc's correct answer:

Are LINQ expression trees directed graphs without cycles?

First off, yes, an expression tree is a DAG -- a directed acyclic graph.

We know they are acyclic because expression trees are immutable, and therefore have to be built from the leaves up. In such a situation there is no way to make a cycle because all the nodes in the cycle would have to be allocated last, and clearly that's not going to happen.

Because the parts are immutable, the expression "tree" need not actually be a tree per se. As Marc points out, it is required that you re-use the reference for the parameter; that's how we determine when a declared parameter is used. It is somewhat strange, though legal, to re-use other parts too. For example, if you wanted to represent the expression tree for the body of (int x)=>(x + 1) * (x + 1), you could make an expression tree for (x + 1) and then make a multiplication node where both children were that expression tree.
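The shared-subexpression case can be sketched with the `Expression` factory methods. `SharedSubtreeDemo` and `Build` are hypothetical names for illustration; the point is only that the same `(x + 1)` node instance is passed as both operands of the multiplication, so the result is a DAG rather than a strict tree.

```csharp
using System;
using System.Linq.Expressions;

static class SharedSubtreeDemo
{
    // Builds (int x) => (x + 1) * (x + 1), reusing one node for both operands.
    public static Expression<Func<int, int>> Build()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");

        // A single (x + 1) node...
        BinaryExpression xPlusOne = Expression.Add(x, Expression.Constant(1));

        // ...used as both the left and the right child of the product.
        BinaryExpression product = Expression.Multiply(xPlusOne, xPlusOne);

        // The same parameter instance is used in the body and in the parameter list.
        return Expression.Lambda<Func<int, int>>(product, x);
    }
}
```

With this sketch, `ReferenceEquals(body.Left, body.Right)` holds for the lambda's body, and `SharedSubtreeDemo.Build().Compile()(3)` evaluates to 16.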

When using ExpressionVisitor to visit the LambdaExpression, the ParameterExpression is visited twice. Is there a way to use the ExpressionVisitor to visit the LambdaExpression so that all the nodes are visited exactly once, and in a specific, well-known order (pre-order, in-order, post-order etc.)?

ExpressionVisitor is an abstract class. You can make your own concrete version of it that has the semantics you like. For example, you can override the Visit method such that it maintains a HashSet of nodes already seen, and does not call Accept on nodes that it has previously accepted.
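A minimal sketch of that idea follows. `VisitOnceVisitor` and its `Order` list are hypothetical names; the sketch assumes that the default equality of `Expression` objects is reference equality, which is what a plain `HashSet<Expression>` relies on here. Nodes are recorded before their children are visited, so the list is in pre-order with respect to whatever child order the base class happens to use.

```csharp
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical visitor: descends into each distinct node instance only once
// and records the order in which nodes were first reached.
sealed class VisitOnceVisitor : ExpressionVisitor
{
    private readonly HashSet<Expression> _seen = new HashSet<Expression>();

    // Nodes in the order they were first reached (pre-order).
    public List<Expression> Order { get; } = new List<Expression>();

    public override Expression Visit(Expression node)
    {
        // Skip nulls and nodes that have already been handled.
        if (node == null || !_seen.Add(node))
            return node;

        Order.Add(node);
        return base.Visit(node); // dispatches to the node-specific Visit* method
    }
}
```

For `(string s) => s.Length`, visiting the lambda with this sketch should record three nodes: the lambda, the member access, and the parameter. The second encounter of the parameter, through `Parameters`, is skipped. This is intended for read-only inspection; a visitor that rewrites nodes would need to remember the replacement for each node rather than simply returning the original on repeat encounters.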

Eric Lippert
  • Thanks @ericlippert, this makes it clearer. One additional question remains: is it OK to reuse the same parameter instance in a second (not nested) lambda, and even have it in the second lambda's Parameters collection? For example, lambda1: (x) => x.Y and lambda2: (x) => x.Z, used in source.Where(lambda1).OrderBy(lambda2). That is something C# LINQ will not produce, but is it considered a valid expression tree? – Tom67 Sep 19 '13 at 17:42
  • @Tom67: **This is a question-and-answer site.** Post that question! – Eric Lippert Sep 19 '13 at 22:57
  • Thanks @ericlippert, now I hope to get the final answer [here](http://stackoverflow.com/questions/18911304/should-linq-lambda-expression-parameters-be-reused-in-a-second-lambda). – Tom67 Sep 20 '13 at 07:37